-- 01_top_demanded_skills.sql
/*
Question: What are the most in-demand skills for data engineers?
- Identify the top 10 in-demand skills for data engineers
- Focus on remote job postings
- Why? The result shows which skills appear most often in remote data engineer postings,
  which points to the most valuable skills for data engineers seeking remote work
*/
SELECT
    sd.skills,
    -- One row per (posting, skill) pair survives the inner joins,
    -- so COUNT(*) is the number of postings that list the skill
    COUNT(*) AS demand_count
FROM job_postings_fact AS jpf
INNER JOIN skills_job_dim AS sjd
    ON jpf.job_id = sjd.job_id
INNER JOIN skills_dim AS sd
    ON sjd.skill_id = sd.skill_id
WHERE job_work_from_home = TRUE AND job_title_short = 'Data Engineer'
GROUP BY
    sd.skills
ORDER BY demand_count DESC
LIMIT 10;
/*
Here's the breakdown of the most demanded skills for remote data engineer roles:
- SQL and Python are by far the most in-demand skills, with around 29,000 job postings each, roughly 60% more than the next closest skill (AWS).
- Cloud platforms come next, with AWS leading at ~18,000 postings, followed by Azure at ~14,000.
- Apache Spark completes the top 5 with nearly 13,000 postings, highlighting the importance of big data processing skills.
Key takeaways:
- SQL and Python remain the foundational skills for data engineers
- Cloud platforms (AWS, Azure) are critical for modern data engineering
- Big data tools like Spark continue to be highly valued
- Data pipeline and platform tools (Airflow, Snowflake, Databricks) are in strong demand, each with roughly 8,000 to 10,000 postings
- Java and GCP round out the top 10 most requested skills
┌────────────┬──────────────┐
│   skills   │ demand_count │
│  varchar   │    int64     │
├────────────┼──────────────┤
│ sql        │        29221 │
│ python     │        28776 │
│ aws        │        17823 │
│ azure      │        14143 │
│ spark      │        12799 │
│ airflow    │         9996 │
│ snowflake  │         8639 │
│ databricks │         8183 │
│ java       │         7267 │
│ gcp        │         6446 │
├────────────┴──────────────┤
│ 10 rows         2 columns │
└───────────────────────────┘
*/
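
/*
Follow-up sketch (an illustrative addition, not part of the original analysis):
the breakdown above compares each skill against the leader by eye. Assuming the
same tables and filters as the main query, a window function can express every
skill's demand as a percentage of the top skill's demand. The column name
pct_of_top_skill is a hypothetical label chosen for this example.
With the counts listed above, aws would come out at about 61% of sql.
*/
SELECT
    sd.skills,
    COUNT(*) AS demand_count,
    -- MAX(...) OVER () is evaluated across all grouped skills before LIMIT is applied,
    -- so the denominator is the demand count of the single most requested skill
    ROUND(100.0 * COUNT(*) / MAX(COUNT(*)) OVER (), 1) AS pct_of_top_skill
FROM job_postings_fact AS jpf
INNER JOIN skills_job_dim AS sjd
    ON jpf.job_id = sjd.job_id
INNER JOIN skills_dim AS sd
    ON sjd.skill_id = sd.skill_id
WHERE job_work_from_home = TRUE AND job_title_short = 'Data Engineer'
GROUP BY
    sd.skills
ORDER BY demand_count DESC
LIMIT 10;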