Generating the 2023 holiday/workday dimension table

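One of our projects has a dimension table that maintains historical holiday and workday information, and many other settings probably have a similar need. At the start of each year, the data for the new year has to be generated. Below is how to insert a new year of data into the dimension table.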

1. Look up the holidays

Look up the holiday dates in the holiday schedule published by the State Council and save them.

	val holidays = Map("20230101" -> "元旦", "20230102" -> "元旦", "20230121" -> "春节", "20230122" -> "春节",
		"20230123" -> "春节", "20230124" -> "春节", "20230125" -> "春节", "20230126" -> "春节", "20230127" -> "春节",
		"20230405" -> "清明",
		"20230429" -> "五一", "20230430" -> "五一", "20230501" -> "五一", "20230502" -> "五一", "20230503" -> "五一",
		"20230622" -> "端午", "20230623" -> "端午", "20230624" -> "端午",
		"20230929" -> "国庆", "20230930" -> "国庆", "20231001" -> "国庆", "20231002" -> "国庆",
		"20231003" -> "国庆", "20231004" -> "国庆", "20231005" -> "国庆", "20231006" -> "国庆"
	)

2. Look up the special workdays

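Holidays in China usually come with make-up workdays: dates that would normally fall on a weekend are worked so that a longer continuous break can be formed. These dates are likewise looked up in the State Council's announcement and saved.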

val workdays = Set("20230128", "20230129", "20230423", "20230506", "20230625", "20231007", "20231008")
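Because both collections are typed in by hand, a quick consistency check before generating the table can catch typos. The following is a minimal sketch, not part of the original code: the object `CalendarChecks` and its `validate` method are hypothetical names, and the rule that every make-up workday should fall on a weekend is an assumption drawn from the description above.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical helper, not part of the original project.
object CalendarChecks {
  // BASIC_ISO_DATE is the JDK's built-in yyyyMMdd formatter; it rejects dates such as 20230230.
  private val fmt = DateTimeFormatter.BASIC_ISO_DATE

  private def parse(s: String): Option[LocalDate] = Try(LocalDate.parse(s, fmt)).toOption

  /** Returns human-readable problems; an empty result means the inputs look consistent. */
  def validate(year: Int, holidays: Map[String, String], workdays: Set[String]): Seq[String] = {
    // Every key must be a real date inside the target year.
    val badDates = (holidays.keySet ++ workdays).toSeq.sorted.flatMap { s =>
      parse(s) match {
        case Some(d) if d.getYear == year => None
        case Some(_)                      => Some(s"$s is not in $year")
        case None                         => Some(s"$s is not a valid yyyyMMdd date")
      }
    }
    // A date cannot be a holiday and a make-up workday at the same time.
    val overlap = holidays.keySet.intersect(workdays).toSeq.sorted
      .map(s => s"$s is listed as both a holiday and a workday")
    // Assumption: make-up workdays always fall on Saturday (6) or Sunday (7).
    val notWeekend = workdays.toSeq.sorted.flatMap { s =>
      parse(s).filter(_.getDayOfWeek.getValue <= 5)
        .map(_ => s"$s is a make-up workday but does not fall on a weekend")
    }
    badDates ++ overlap ++ notWeekend
  }
}
```

Calling `CalendarChecks.validate(2023, holidays, workdays)` and printing the result before step 4 would surface any mistyped date.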

3. Create a case class matching the dimension table

Create a case class based on the dimension table (the table already exists in the database, so the case class structure follows its schema).

case class DimHolidayInfo(date: Int, date_s: String, is_holiday: Byte, holiday_name: String, is_workday: Byte, day_of_week: Byte, day_of_week_c: String);

	val weekdays : Map[String, Byte] = Map(
			"星期一" -> 1,
			"星期二" -> 2,
			"星期三" -> 3,
			"星期四" -> 4,
			"星期五" -> 5,
			"星期六" -> 6,
			"星期日" -> 7)

4. Write the data

	
	import scala.collection.mutable.ArrayBuffer
	import org.apache.spark.sql.SparkSession

	def GenData(spark: SparkSession, output: String) = {
		import spark.implicits._
		
		val (startdate, enddate) = ("20230101", "20231231")
		val dateset = TimeUtils.genYmdSet(startdate, enddate)
		
		val result: ArrayBuffer[DimHolidayInfo] = ArrayBuffer()
		for(each <- dateset) {
			val date = each.toInt
			val (year, month, day) = (each.substring(0, 4), each.substring(4, 6), each.substring(6, 8))
			val date_s = Array(year, month, day).mkString("-")
			val isholiday: Byte = if (holidays.contains(each)) 1 else 0
			val holidayname = holidays.getOrElse(each, "")
			val isworkday = isWorkDay(each)
			val dayofweekc = TimeUtils.getWeekDay(each)
			val dayofweek: Byte = weekdays.getOrElse(dayofweekc, -1)
			
			val obj = DimHolidayInfo(date, date_s, isholiday, holidayname, isworkday, dayofweek, dayofweekc)
			result.append(obj)
		}
		
		spark.sparkContext.parallelize(result, 1).toDF()
			.write
			.mode("overwrite")
			.parquet(output)
	}

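Here TimeUtils.genYmdSet(startdate, enddate) generates the sequence of dates for the whole year, and TimeUtils.getWeekDay(each) returns the day of the week for a date, as one of the Chinese names used as keys in weekdays (for example "星期一").
Finally, the code saves the result in Parquet format.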
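The listing above calls TimeUtils.genYmdSet, TimeUtils.getWeekDay and isWorkDay without showing them. The sketch below is one possible implementation using java.time, not the author's actual code. The object name `TimeUtilsSketch` is hypothetical. The original calls isWorkDay with a single argument, so passing `holidays` and `workdays` explicitly is an assumption. So is the rule itself: a date counts as a workday if it is a listed make-up workday, or if it is a Monday to Friday that is not a holiday.

```scala
import java.time.{DayOfWeek, LocalDate}
import java.time.format.DateTimeFormatter

// Hypothetical stand-in for the helpers that the original listing does not show.
object TimeUtilsSketch {
  private val ymd = DateTimeFormatter.BASIC_ISO_DATE // yyyyMMdd

  // Index 0 is Monday, matching java.time's DayOfWeek values 1..7.
  private val weekdayNames =
    Array("星期一", "星期二", "星期三", "星期四", "星期五", "星期六", "星期日")

  /** All dates from start to end (inclusive) as yyyyMMdd strings, in ascending order. */
  def genYmdSet(start: String, end: String): Seq[String] = {
    val first = LocalDate.parse(start, ymd)
    val last = LocalDate.parse(end, ymd)
    Iterator.iterate(first)(_.plusDays(1))
      .takeWhile(d => !d.isAfter(last))
      .map(_.format(ymd))
      .toVector
  }

  /** Chinese weekday name for a yyyyMMdd date, usable as a key into `weekdays`. */
  def getWeekDay(date: String): String =
    weekdayNames(LocalDate.parse(date, ymd).getDayOfWeek.getValue - 1)

  /** 1 if the date is worked, 0 otherwise; make-up workdays override weekends. */
  def isWorkDay(date: String, holidays: Map[String, String], workdays: Set[String]): Byte = {
    val dow = LocalDate.parse(date, ymd).getDayOfWeek
    val weekend = dow == DayOfWeek.SATURDAY || dow == DayOfWeek.SUNDAY
    if (workdays.contains(date)) 1
    else if (holidays.contains(date) || weekend) 0
    else 1
  }
}
```

Returning an ordered sequence rather than a set keeps the generated rows in calendar order. Whether the original genYmdSet guarantees an order is not stated.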
