Soonner or later, you are going to write some SQL query during the process of job search. This article tries to summarize some common SQL techniques, heuristics that can be applied to querying problem solving.
VERY BASIC STUFF
The syntax in SQL is extremely simple. Know it, memorize it and implement it as back of your hand.
- select (if) count(distinct()) case () when then () else () end sd ()
- from () join() union() a horizontal version of join()
- where(and)(or)
- group by
- having
- order by
- limit
- offset
Think yourself are in a excel window, this can be auxiliary to solving a problem.
The query is to subset data, select helps you to decide the column, where helps you to specify the role, group by helping you to know the descriptive statistics and limit offset are often used together to tackle problems such as find the second largest or third largest customer(student) or whatever unit.
Knowing all these can help you get past the first simple question in the interview. Then it's time to do sub query and common table expressions(CTE).
SUBQUERY AND COMMON TABLE EXPRESSIONS
Subquery can appear in any where in your general sentence. It can be nested in select, from, where and having. CTEs and subqueries allow you to grab a subset of data and store that data with a name, which you can then select from and perform more operations on.
The strategy of thinking backwards and wishful thinking can be very useful in this stage.
Always try to ask yourself how I can finish this problem by a single query. Then developing from there, try to create all kinds of temporary tables you need in this single query. Finally put there parts together into just one query via subqueries.
EFFICIENCY PROBLEMS
Make sure you filter and aggregate data before you join the tables, this will make your query efficiency.
WINDOW FUNCTIONS
Make sure you understand window functions. Use PostgreSQL to finish the challenge if you can. MySQL 5.6 sucks because it can't support window functions.
Window function is just like group by. It separates your data into different chunks and allows you to operate on each chunk individually. But unlike group by, rows are not combined.
Every time you need to do arithmetic operations on a individual value with an aggregated value, you should definitely think of window functions.
The common window functions can be
- sum() over(partition by())
- avg() over(partition by())
- rank() over(partition by())
- lag() over(partition by())
- first_value() over(partition by())
ASSORTED HEURISTICS
If you get stuck at solving query problems, you can always try these things to move on a little bit:
- Try to write a query on the smaller pieces of data.
- Ask yourself how I can solve this problem in a single query?
- Can I filter, aggregate or join some tables?
- If you really get stuck at some extremely hard problem, try to solve this problem in excel using mouse, graphic interface, pivot table, sorting and then reimplement it in sql. I know it sounds silly but it really works for very difficult problems.
PRACTICE WORKFLOW
The best way to learn is to test yourself ruthlessly.
- Make sure you understand the problems
- Attack those problems using your own developed heuristics(Mine is the above four)
- If you can't solve it more than 20 minutes, give up and find answers on Google!
- Make sure you understand it by playing with it on sql.fiddle.
- Re implement them from your memory until it gets correct and passed.
- Record it to your lift and revisit it using spacing repetition test techniques.
FINAL WORDS
- The best way to improve your SQL skills at the very beginning is creating an environment that you can see the instant visual feedback(SQL fiddle), SQLZOO has a strong interface to see the visual graphics of your query instantly.
- Free recall is a powerful weapon in improving any skills, free recall your knowledge in a spaced repetition manner. It's painful but it's a roadway to become a beast and it's definitely a good form of pain.
You can visit my website to know more things on data science