# bigdataProject
**Repository Path**: kevin-beam/bigdata-project
## Basic Information
- **Project Name**: bigdataProject
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-12-13
- **Last Updated**: 2025-01-02
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# 抖音用户数据可视化分析
## 数据来源
https://www.heywhale.com/mw/project/62dea26975127f402e63a9f8
https://www.heywhale.com/mw/project/6711df0d739a15610f8fb74b/dataset
https://www.heywhale.com/mw/dataset/661118b2d37810d31dac577e/file
https://opendata.pku.edu.cn/file.xhtml?fileId=6667&datasetVersionId=587
## 数据集
### 小米su7抖音评论
| ip_label | pl | pl_count | is_image | user_id | love | pl_time |
| -------- | ---- | -------- | ------------------------------------------ | ------- | ---------- | -------- |
| 地区 | 评论 | 评论数量 | 评论图片
1 表示包含
0 表示不包含 | 用户 id | 评论点赞数 | 评论时间 |
### 抖音网红排行榜
| 抖音号 | 昵称 | 性别 | 个人签名 | 城市 | 生日 | 星座 | 蓝V认证原因 | 粉丝数 | 获赞数 | 视频数 | 采集时间 | 详细数据 |
| ------ | ---- | ---- | -------- | ---- | ---- | ---- | ----------- | ------ | ------ | ------ | -------- | -------- |
### 奥运健儿抖音相关数据
| 达人人名称 | 达人链接 | 达人标签 | 达人类型 | 粉丝数 | 近7日作品数 | 上周总评论数 | 上周总点赞数 | 账号简介 | 认证信息 | 作品数 | 所属ip | 总获赞数 | 近30天新增作品数 | 近30天新增粉丝数 | 近30天互动平均数 | 近30天互动中位数 | 近30天爆文数 |
| ---------- | -------- | -------- | -------- | ------ | ----------- | ------------ | ------------ | -------- | -------- | ------ | ------ | -------- | ---------------- | ---------------- | ---------------- | ---------------- | ------------ |
### 上传原始数据到HDFS
```bash
hdfs dfs -mkdir -p /douyin-data/ods
```
```bash
hdfs dfs -put xiaomisu7.csv /douyin-data/ods
hdfs dfs -put douyindataset.csv /douyin-data/ods
```
### 创建 hive 数据库
```sql
create database douyin;
```
## springboot工程
### 新建工程


### pom.xml
```
4.0.0
org.springframework.boot
spring-boot-starter-parent
2.7.18
cn.lhz
tour-analysis-hive
0.0.1
tour-analysis-hive
tour-analysis-hive
21
21
${jdk.version}
${jdk.version}
${jdk.version}
utf-8
utf-8
UTF-8
true
true
2.17.0
3.17.0
1.2.23
2.1.4
2.0.53
2.0.53
2.11.0
1.1.2
5.8.32
5.11.2
2.0.0
2.18.0
2.9.18
1.18.34
3.5.9
3.0.4
2.7.18
org.springframework.boot
spring-boot-starter-thymeleaf
org.springframework.boot
spring-boot-starter-web
org.springframework.boot
spring-boot-starter-tomcat
com.baomidou
mybatis-plus-boot-starter
${mybatis-plus.version}
com.baomidou
mybatis-plus
${mybatis-plus.version}
com.baomidou
mybatis-plus-jsqlparser
${mybatis-plus.version}
org.springframework.boot
spring-boot-configuration-processor
true
org.projectlombok
lombok
true
org.springframework.boot
spring-boot-starter-test
test
cn.hutool
hutool-all
${hutool.version}
com.github.binarywang
java-testdata-generator
1.1.2
org.apache.commons
commons-lang3
${commons-lang3.version}
commons-io
commons-io
${commons-io.version}
com.google.code.gson
gson
${gson.version}
org.webjars
layui
${layui.version}
com.alibaba
druid
${druid.version}
io.springfox
springfox-boot-starter
3.0.0
io.springfox
springfox-swagger-ui
3.0.0
com.github.xiaoymin
swagger-bootstrap-ui
1.9.6
com.github.xiaoymin
knife4j-spring-ui
3.0.3
org.apache.hive
hive-jdbc
3.1.3
ch.qos.logback
logback-classic
org.apache.logging.log4j
log4j-slf4j-impl
org.slf4j
slf4j-log4j12
org.eclipse.jetty.orbit
*
org.eclipse.jetty.aggregate
*
tomcat
*
javax.servlet
servlet-api
org.mortbay.jetty
*
org.apache.maven.plugins
maven-compiler-plugin
3.13.0
UTF-8
${jdk.version}
${jdk.version}
org.springframework.boot
spring-boot-maven-plugin
org.projectlombok
lombok
${project.name}
public
aliyun nexus
https://maven.aliyun.com/repository/public
true
public
aliyun nexus
https://maven.aliyun.com/repository/public
true
false
```
### 配置文件
> application.yml
```yml
server:
port: 8080
servlet:
context-path: /api
spring:
application:
<<<<<<< HEAD
name: douyin-analysis-hive
mvc:
path match:
matching-strategy: ant_path_matcher
=======
name: douyin-analysis-presto
>>>>>>> f5a20945be5e93fadca75af3d7977396951ef1e0
jackson:
serialization:
FAIL_ON_EMPTY_BEANS: false
# json 序列化排除值为 null 的属性
default-property-inclusion: non_null
# 配置 Date 类的时间格式,如果不涉及可以不加
date-format: yyyy-MM-dd HH:mm:ss
# 配置 Date 类的时区,如果不涉及可以不加
time-zone: GMT+8
thymeleaf:
#缓冲的配置
cache: false
check-template: true
check-template-location: true
#开启MVC thymeleaf 视图解析
enabled: true
#模板的模式,支持 HTML, XML TEXT JAVASCRIPT
mode: HTML5
#编码 可不用配置
encoding: UTF-8
#配置模板路径,默认是templates,可以不用配置
prefix: classpath:templates
# 文件后缀名
suffix: .html
#内容类别,可不用配置
servlet:
content-type: text/html;charset=utf-8
datasource:
type: com.alibaba.druid.pool.DruidDataSource
<<<<<<< HEAD
driver-class-name: org.apache.hive.jdbc.HiveDriver
url: jdbc:hive2://lihaozhe03:10000/lihaozhe
=======
driver-class-name: com.facebook.presto.jdbc.PrestoDriver
url: jdbc:presto://lihaozhe01:8085/hive/douyin-analysis
>>>>>>> f5a20945be5e93fadca75af3d7977396951ef1e0
username: root
password:
mybatis-plus:
# 如果是放在src/main/java目录下 classpath:/com/yourpackage/*/mapper/*Mapper.xml
# 如果是放在resource目录 classpath:/mapper/*Mapper.xml
mapper-locations: classpath:mapper/*.xml
#实体扫描,多个package用逗号或者分号分隔
type-aliases-package: cn.lhz.dto,cn.lhz.vo
configuration:
#配置返回数据库(column下划线命名&&返回java实体是驼峰命名),自动匹配无需as(没开启这个,SQL需要写as: select user_id as userId)
map-underscore-to-camel-case: true
cache-enabled: false
#配置JdbcTypeForNull, oracle数据库必须配置
jdbc-type-for-null: 'null'
log-impl: org.apache.ibatis.logging.stdout.StdOutImpl
#是否激活 swagger true or false
springfox:
documentation:
enabled: true
```
### 工具类
#### util 包下的工具类
#### MybatisPlus配置类
```java
package cn.lhz.config;
import com.baomidou.mybatisplus.annotation.DbType;
import com.baomidou.mybatisplus.extension.plugins.MybatisPlusInterceptor;
import com.baomidou.mybatisplus.extension.plugins.inner.PaginationInnerInterceptor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
/**
* @author
* @version
*/
@Configuration
public class MybatisPlusConfig {
@Bean
public MybatisPlusInterceptor mybatisPlusInterceptor() {
MybatisPlusInterceptor interceptor = new MybatisPlusInterceptor();
// 于 v3.5.9 起,PaginationInnerInterceptor 已分离出来。如需使用,则需单独引入 mybatis-plus-jsqlparser 依赖
interceptor.addInnerInterceptor(new PaginationInnerInterceptor(DbType.PRESTO));
return interceptor;
}
}
```
#### 全局异常处理
```java
package cn.lhz.config;
import cn.lhz.util.response.ResponseResult;
import cn.lhz.util.response.ResultCode;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;
/**
* 全局异常处理
*
* @author
* @version
*/
@ControllerAdvice
@RestControllerAdvice
public class GlobalException {
@ExceptionHandler(Exception.class)
public ResponseResult defaultErrorHandler(Exception e) {
return new ResponseResult<>(ResultCode.EXCEPTION.getCode(), e.getMessage());
}
}
```
### idea连接hive
#### 添加驱动

#### 配置数据源

## ODS
**hive 创建抖音用户数据分析库**
```
-- 创建抖音用户数据分析库
drop database if exists douyin_analysis cascade;
create database if not exists douyin_analysis;
use douyin_analysis;
```
### hive创建小米su7数据表
```
use douyin_analysis;
-- ODS层创建小米su7数据表
drop table if exists douyin_analysis.ods_xiaomisu7;
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.ods_xiaomisu7
(
ip_label STRING COMMENT '地区',
pl STRING COMMENT '评论',
pl_count INT COMMENT '评论数量',
is_image INT COMMENT '评论图片 1 表示包含 0 表示不包含',
user_id STRING COMMENT '用户 id',
love INT COMMENT '评论点赞数',
pl_time STRING COMMENT '评论时间'
) COMMENT '小米su7抖音评论'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/1-ODS/xiaomisu7';
-- 加载本地数据到数据表
load data local inpath '/root/data/xiaomisu7.csv' overwrite into table douyin_analysis.ods_xiaomisu7;
-- 查询数据
select * from ods_xiaomisu7;
select count(*) as total from ods_xiaomisu7;
```
### hive创建抖音网红排行榜
```
use douyin_analysis;
--ODS创建抖音网红排行表
drop table if exists douyin_analysis.ods_douiynwanghong;
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.ods_douyinwanghong
(
id STRING COMMENT '抖音号',
name STRING COMMENT '昵称',
gender STRING COMMENT '性别',
signature STRING COMMENT '个人签名',
city STRING COMMENT '城市',
birthday STRING COMMENT '生日',
constellation STRING COMMENT '星座',
Blue_V STRING COMMENT '蓝v认证原因',
number_fans STRING COMMENT '粉丝数',
number_likes INT COMMENT '获赞数',
number_videos STRING COMMENT '视频数',
is_time STRING COMMENT '采集时间',
data STRING COMMENT '详细数据'
) COMMENT '抖音网红排行表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '.'
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/1-ODS/douyinwanghong';
-- 加载本地数据到数据表
load data local inpath '/root/data/douyinwanghong.csv' overwrite into table douyin_analysis.ods_douyinwanghong;
-- 查询数据
select * from douyin_analysis.ods_douyinwanghong;
select count(*) from douyin_analysis.ods_douyinwanghong;
```
### hive创建奥运健儿抖音相关数据
```
use douyin_analysis;
-- 创建奥运健儿抖音数据表
drop table if exists douyin_analysis.ods_aoyun;
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.ods_aoyun
(
i_name STRING COMMENT '达人名称',
i_link STRING COMMENT '达人链接',
i_type STRING COMMENT '标签类型',
f_type STRING COMMENT '粉丝类型',
f_number_last_week STRING COMMENT '近7日粉丝数',
f_number_week STRING COMMENT '周总粉丝数',
f_comments_week STRING COMMENT '周总评价数',
f_likes_week STRING COMMENT '周总点赞数',
profile STRING COMMENT '简介',
certification STRING COMMENT '认证信息',
works STRING COMMENT '作品信息',
ip STRING COMMENT '所属IP',
f_number_last_month STRING COMMENT '近30日新增粉丝数',
f_likes_last_month STRING COMMENT '近30日新增点赞数',
f_comments_last_month STRING COMMENT '近30日新增评论数',
interaction FLOAT COMMENT '互动指数',
balance STRING COMMENT '粉丝均衡度',
chinese_works STRING COMMENT '中文作品数'
) COMMENT '奥运健儿抖音数据表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '.'
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/1-ODS/aoyun';
-- 加载本地数据到数据表
load data local inpath '/root/data/aoyun.csv' overwrite into table douyin_analysis.ods_aoyun;
-- 查询数据
select * from douyin_analysis.ods_aoyun;
select count(*) from douyin_analysis.ods_aoyun;
```
## DWD
### hive创建小米su7数据表格
```
drop table if exists douyin_analysis.dwd_xiaomisu7;
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.dwd_xiaomisu7
(
ip_label STRING COMMENT '地区',
pl STRING COMMENT '评论',
pl_count INT COMMENT '评论数量',
is_image INT COMMENT '评论图片 1 表示包含 0 表示不包含',
user_id STRING COMMENT '用户 id',
love INT COMMENT '评论点赞数',
pl_time STRING COMMENT '评论时间'
) COMMENT '小米su7抖音评论'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/2-DWD/xiaomisu7';
-- ETL 处理 ODS层空数据
insert overwrite table douyin_analysis.dwd_xiaomisu7
select
coalesce(ip_label, 'unknown') as ip_label,
coalesce(pl, '') as pl,
coalesce(pl_count, 0) as pl_count,
coalesce(is_image, 0) as is_image,
coalesce(user_id, 'unknown') as user_id,
coalesce(love, 0) as love,
coalesce(pl_time, 'unknown') as pl_time
from douyin_analysis.ods_xiaomisu7
where ip_label is not null
and pl is not null
and pl_count is not null
and is_image is not null
and user_id is not null
and love is not null
and pl_time is not null;
```
### hive创建抖音网红排行表
```
--创建DWD层抖音网红排行表
drop table if exists douyin_analysis.dwd_douyinwanghong;
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.dwd_douyinwanghong
(
id STRING COMMENT '抖音号',
name STRING COMMENT '昵称',
gender STRING COMMENT '性别',
signature STRING COMMENT '个人签名',
city STRING COMMENT '城市',
birthday STRING COMMENT '生日',
constellation STRING COMMENT '星座',
Blue_V STRING COMMENT '蓝v认证原因',
number_fans STRING COMMENT '粉丝数',
number_likes INT COMMENT '获赞数',
number_videos STRING COMMENT '视频数',
is_time STRING COMMENT '采集时间',
data STRING COMMENT '详细数据'
) COMMENT '抖音网红排行表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/2-DWD/douyinwanghong';
-- ETL 处理 ODS层空数据
insert overwrite table douyin_analysis.dwd_douyinwanghong
select
coalesce(id, 'unknown') as id,
coalesce(name, '') as name,
coalesce(gender, 'unknown') as gender,
coalesce(signature, '') as signature,
coalesce(city, 'unknown') as city,
coalesce(birthday, 'unknown') as birthday,
coalesce(constellation, 'unknown') as constellation,
coalesce(Blue_V, '') as Blue_V,
coalesce(number_fans, '0') as number_fans,
coalesce(number_likes, 0) as number_likes,
coalesce(number_videos, '0') as number_videos,
coalesce(is_time, 'unknown') as is_time,
coalesce(data, '') as data
from douyin_analysis.ods_douyinwanghong
where id is not null
and name is not null
and gender is not null
and signature is not null
and city is not null
and birthday is not null
and constellation is not null
and Blue_V is not null
and number_fans is not null
and number_likes is not null
and number_videos is not null
and is_time is not null
and data is not null;
```
### hive创建奥运健儿抖音相关数据
```
--创建DWD层奥运健儿抖音数据表
drop table if exists douyin_analysis.dwd_aoyun;
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.dwd_aoyun
(
i_name STRING COMMENT '达人名称',
i_link STRING COMMENT '达人链接',
i_type STRING COMMENT '标签类型',
f_type STRING COMMENT '粉丝类型',
f_number_last_week STRING COMMENT '粉丝数',
f_number_week STRING COMMENT '周总粉丝数',
f_comments_week STRING COMMENT '周总评价数',
f_likes_week STRING COMMENT '周总点赞数',
profile STRING COMMENT '简介',
certification STRING COMMENT '认证信息',
works STRING COMMENT '作品信息',
ip STRING COMMENT '所属IP',
f_number_last_month STRING COMMENT '近30日新增粉丝数',
f_likes_last_month STRING COMMENT '近30日新增点赞数',
f_comments_last_month STRING COMMENT '近30日新增评论数',
interaction FLOAT COMMENT '互动指数',
balance STRING COMMENT '粉丝均衡度',
chinese_works STRING COMMENT '中文作品数'
) COMMENT '奥运健儿抖音数据表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '.'
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/1-ODS/douyindataset';
-- ETL 处理 ODS层空数据
-- 插入数据到目标表,进行ETL处理
insert overwrite table douyin_analysis.dwd_aoyun
select
coalesce(i_name, 'unknown') AS i_name,
coalesce(i_link, '') AS i_link,
coalesce(i_type, 'unknown') AS i_type,
coalesce(f_type, 'unknown') AS f_type,
coalesce(f_number_last_week, '0') AS f_number_last_week,
coalesce(f_number_week, '0') AS f_number_week,
coalesce(f_comments_week, 0) AS f_comments_week,
coalesce(f_likes_week, 0) AS f_likes_week,
coalesce(profile, '') AS profile,
coalesce(certification, '') AS certification,
coalesce(works, '') AS works,
coalesce(ip, 'unknown') AS ip,
coalesce(f_number_last_month, '0') AS f_number_last_month,
coalesce(f_likes_last_month, '0') AS f_likes_last_month,
coalesce(f_comments_last_month, '0') AS f_comments_last_month,
coalesce(interaction, 0.0) AS interaction,
coalesce(balance, 'unknown') AS balance,
coalesce(chinese_works, '0') AS chinese_works
from douyin_analysis.ods_aoyun
where i_name is not null
AND i_link is not null
AND i_type is not null
AND f_type is not null
AND f_number_last_week is not null
AND f_number_week is not null
AND f_comments_week is not null
AND f_likes_week is not null
AND profile is not NULL
AND certification is not null
AND works is not null
AND ip is not null
AND f_number_last_month is not null
AND f_likes_last_month is not null
AND f_comments_last_month is not null
AND interaction is not null
AND balance is not null
AND chinese_works is not null;
```
<<<<<<< HEAD
## DWT
### 1.xiaomusu7评论ip属地
```
DROP TABLE IF EXISTS douyin_analysis.dwt_xiaomisu7_ip_count
CREATE TABLE IF NOT EXISTS douyin_analysis.dwt_xiaomisu7_ip_count
(
ip_label STRING COMMENT '地区',
count INT COMMENT '出现次数'
) COMMENT '小米su7抖音评论地区出现次数';
INSERT OVERWRITE TABLE douyin_analysis.dwt_xiaomisu7_ip_count
SELECT
ip_label,
COUNT(*) as count
FROM douyin_analysis.dwd_xiaomisu7
GROUP BY ip_label;
```
### 2.奥运健儿相关信息
```
DROP TABLE IF EXISTS douyin_analysis.dwt_aoyun_info
-- 创建DWT层表
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.dwt_aoyun_info
(
i_name STRING COMMENT '达人名称',
certification STRING COMMENT '认证信息'
) COMMENT '奥运健儿抖音达人名称与认证信息'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/2-DWT/aoyun_info';
-- 插入数据到DWT层表
INSERT OVERWRITE TABLE douyin_analysis.dwt_aoyun_info
SELECT
i_name,
certification
FROM douyin_analysis.dwd_aoyun;
-- 如果表存在则删除
DROP TABLE IF EXISTS douyin_analysis.dwt_aoyun_info;
-- 创建DWT层表
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.dwt_aoyun_info (
i_name STRING COMMENT '达人名称',
certification STRING COMMENT '认证信息',
ip STRING COMMENT 'IP地址',
f_number_last_month INT COMMENT '上月粉丝数',
f_likes_last_month INT COMMENT '上月点赞数'
) COMMENT '奥运健儿抖音达人名称与认证信息'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/2-DWT/aoyun_info';
-- 插入数据到DWT层表
INSERT OVERWRITE TABLE douyin_analysis.dwt_aoyun_info
SELECT
i_name,
certification,
ip,
f_number_last_month,
f_likes_last_month
FROM douyin_analysis.dwd_aoyun;
```
### 3.奥运健儿粉丝数
```
-- 创建 DWT 层奥运健儿抖音数据表
drop table if exists douyin_analysis.dwt_aoyun;
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.dwt_aoyun
(
i_name STRING COMMENT '达人名称',
total_f_number_last_week BIGINT COMMENT '按达人名称汇总的上周粉丝总数'
) COMMENT '奥运健儿抖音 DWT 层汇总表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '.'
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/2-DWT/douyindataset';
-- ETL 处理,从 DWD 层插入数据到 DWT 层
insert overwrite table douyin_analysis.dwt_aoyun
select
i_name,
sum(f_number_last_week) as total_f_number_last_week
from douyin_analysis.dwd_aoyun
group by i_name;
```
### 4.抖音网红性别,星座分布
```
-- 创建DWT层抖音网红部分信息表
drop table if exists douyin_analysis.dwt_douyinwanghong_partial;
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.dwt_douyinwanghong_partial
(
name STRING COMMENT '昵称',
gender STRING COMMENT '性别',
constellation STRING COMMENT '星座'
) COMMENT '抖音网红部分信息表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/3-DWT/douyinwanghong_partial';
-- ETL处理,从DWD层表中筛选数据并插入DWT层表
insert overwrite table douyin_analysis.dwt_douyinwanghong_partial
select
coalesce(name, '') as name,
coalesce(gender, 'unknown') as gender,
coalesce(constellation, 'unknown') as constellation
from douyin_analysis.dwd_douyinwanghong;
```
### 5.抖音网红性别数量表
```
-- 创建DWT层抖音网红性别数量表
drop table if exists douyin_analysis.dwt_douyinwanghong_gender_count;
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.dwt_douyinwanghong_gender_count
(
gender STRING COMMENT '性别',
count INT COMMENT '该性别网红数量'
) COMMENT '抖音网红性别数量表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/3-DWT/douyinwanghong_gender_count';
-- ETL处理,从dwt_douyinwanghong_partial表中统计不同性别数量并插入dwt_douyinwanghong_gender_count表
insert overwrite table douyin_analysis.dwt_douyinwanghong_gender_count
select
gender,
count(*) as count
from douyin_analysis.dwt_douyinwanghong_partial
group by gender;
```
### 6.抖音网红星座数量表
```
-- 创建DWT层抖音网红星座数量表
drop table if exists douyin_analysis.dwt_douyinwanghong_constellation_count;
CREATE EXTERNAL TABLE IF NOT EXISTS douyin_analysis.dwt_douyinwanghong_constellation_count
(
constellation STRING COMMENT '星座',
count INT COMMENT '该星座网红数量'
) COMMENT '抖音网红星座数量表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/douyin_analysis/3 - DWT/douyinwanghong_constellation_count';
-- ETL处理,从dwt_douyinwanghong_partial表中统计不同星座数量并插入dwt_douyinwanghong_constellation_count表
insert overwrite table douyin_analysis.dwt_douyinwanghong_constellation_count
select
constellation,
count(*) as count
from douyin_analysis.dwt_douyinwanghong_partial
group by constellation;
```
## api数据接口
### 1.xaiomusu7抖音用户评论分布地图
#### dto
```bash
package cn.lhz.dto;
import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;
/**
* @author 牛枥凯
* @version 1.0.0 2024/12/27 上午11:26
*/
@Setter
@Getter
@NoArgsConstructor
@AllArgsConstructor
public class DwdXiaomisu7 {
private String i_name;
private String i_link;
private String i_type;
private String f_type;
private int f_number_last_week;
private int f_number_week;
private int f_comments_week;
private int f_likes_week;
private String profile;
private String certification;
private int works;
private String ip;
private int f_number_last_month;
private int f_likes_last_month;
private int f_comments_last_month;
private int interaction;
private int balance;
private int chinese_works;
}
```
#### Mapper
```bash
package cn.lhz.mapper;
import cn.lhz.dto.DwdXiaomisu7;
import com.baomidou.mybatisplus.core.mapper.BaseMapper;
import org.apache.ibatis.annotations.Mapper;
/**
* @author 牛枥凯
* @version 1.0.0
*/
@Mapper
public interface Xiaomisu7Mapper extends BaseMapper {
}
```
#### Service
```bash
package cn.lhz.service;
import cn.lhz.dto.DwdXiaomisu7;
import com.baomidou.mybatisplus.extension.service.IService;
/**
* @author 牛枥凯
* @version 1.0.0
*/
public interface Xiaomisu7Service extends IService {
}
```
#### Servicelmpl
```bash
package cn.lhz.service.impl;
import cn.lhz.dto.DwdXiaomisu7;
import cn.lhz.mapper.Xiaomisu7Mapper;
import cn.lhz.service.Xiaomisu7Service;
import com.baomidou.mybatisplus.extension.service.impl.ServiceImpl;
import org.springframework.stereotype.Service;
@Service
public class Xiaomisu7ServiceImpl extends ServiceImpl implements Xiaomisu7Service {
}
```
#### Controller
```bash
package cn.lhz.controller;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
@RestController
@RequestMapping("/xiaomisu7")
public class Xiaomisu7Controller {
@GetMapping("/dwtInfo")
public List