Just as many people think of data science in terms of tools (R, Python, SQL, etc.), many people think of learning data science in terms of resources.
However, the most important thing to figure out before you learn data science is the process, not the resources. With the right process, you can plug in any resources and still keep learning.
Think about it this way: people hiring data scientists don’t care what courses you’ve taken; they care what you can do for their business. The key is to get yourself into a position where you can contribute meaningfully.
In order to get there, you have to do the following:
- Learn to enjoy analyzing data and finding insights
- Start building projects and a portfolio
- Present your insights to others and learn how to communicate
- Work with larger datasets and push your boundaries
This process involves self-motivation, finding what interests you, and working on projects. You can read more here, or see the process in action at Dataquest, where this is our teaching method.
The key is to figure out a process that motivates you and gets you to your learning goals. A great way to do this is to:
- Learn enough of the basics to start working on projects
- Find interesting datasets to analyze
- Start analyzing datasets and showing your results to others
- Keep doing more complex analysis
Along the way, you’ll naturally learn the skills you need (when you need them for your projects).
Let’s break this down by step:
Learn enough of the basics to start working on projects
This means learning some basic statistics and programming. Some good resources are:
- Codecademy — learn the basics of programming.
- Khan Academy — learn statistics and linear algebra.
- OpenIntro Stats — learn statistics.
- Dataquest — learn data science concepts, including data analysis, programming, working with databases, and machine learning.
- Automate the Boring Stuff with Python — learn how to use Python in a practical way.
Don’t spend too much time on the basics; learn just enough to start working on projects.
Find interesting datasets to analyze
You can find interesting datasets to analyze in quite a few places.
You can find more resources here. The key is to make sure that you’re interested in the datasets, because you have to be able to motivate yourself to build several projects.
Start analyzing datasets and showing your results to others
Start analyzing the datasets using tools like Jupyter Notebook, and post the results on a blog or GitHub. At Dataquest, we give you structured guidance while you’re building these projects. You can also try to replicate people’s blog posts from blogs here.
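As a rough illustration of what a first analysis might look like, here is a minimal sketch that groups rows and averages a column using only Python's standard library. The dataset, the column names, and the `average_by_group` helper are all hypothetical and made up for this example; in practice you would load a dataset you actually care about, probably in a Jupyter Notebook.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample data standing in for a real dataset you find interesting.
SAMPLE_CSV = """city,month,rides
Austin,Jan,120
Austin,Feb,150
Boston,Jan,90
Boston,Feb,60
"""


def average_by_group(csv_text, group_col, value_col):
    """Return the mean of value_col for each distinct value of group_col."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row[group_col]].append(float(row[value_col]))
    return {name: mean(values) for name, values in groups.items()}


averages = average_by_group(SAMPLE_CSV, "city", "rides")
for city, avg in sorted(averages.items()):
    print(f"{city}: {avg:.1f} rides per month on average")
```

Even a small summary like this gives you something concrete to write up and share, which matters more at this stage than the sophistication of the technique.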
Try to find interesting patterns and make compelling visualizations. You’ll then be able to share your results with communities, both in person and online.
Keep doing more complex analysis
Make sure to keep pushing yourself to do more complex analysis. This can mean increasing the size of the datasets, working with more datasets at once, or using more complex techniques.
This is critical because it ensures that you keep pushing your boundaries and learning.
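One way to picture "increasing the size of the datasets" is moving from loading everything into memory to processing a file one row at a time. The sketch below is only an illustration under that assumption: it computes a running mean and sample standard deviation in a single pass using Welford's algorithm. The `streaming_mean_std` helper, the `value` column, and the in-memory stand-in for a large file are all hypothetical.

```python
import csv
import io
import math


def streaming_mean_std(rows, value_col):
    """Compute count, mean, and sample standard deviation in one pass
    (Welford's algorithm), without holding all rows in memory."""
    count, mean, m2 = 0, 0.0, 0.0
    for row in rows:
        x = float(row[value_col])
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # uses the already-updated mean
    std = math.sqrt(m2 / (count - 1)) if count > 1 else 0.0
    return count, mean, std


# Hypothetical stand-in for a large file. With a real file you would pass
# csv.DictReader(open("some_large_file.csv", newline="")) instead.
fake_file = io.StringIO("value\n" + "\n".join(str(i) for i in range(1, 11)))
count, mean, std = streaming_mean_std(csv.DictReader(fake_file), "value")
print(count, round(mean, 2), round(std, 2))
```

Because `csv.DictReader` yields rows lazily, the same function works unchanged on a file far larger than available memory, which is the kind of step up in difficulty this stage is about.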
The bottom line
If you follow the process above, you’ll be able to learn data science and build a solid project portfolio. The key is to ensure that you’re working on things that interest you and that you’re driven to do.
Good luck learning data science!