September 18, 2020
This is just a small collection of my frequently given advice to young or newer members of the analytics industry.
If nothing else - I’d start with this:
Data analysis exists as a scientific exploration for the purpose of driving value for its audience. That value can be as simple as an aggregation of observations or, hopefully, as significant as better decision making. Everything else you do will become useless if the root goal is lost:
Drive value through analytical insight.
How to use this #
Each item in the subsequent sections is a tip-of-the-iceberg question meant to point you in a direction: a hint at a place to start.
The intention is to update this list periodically but also to be non-specific enough to last for the foreseeable future.
Data Layers #
1st order concepts:
- What is my primary question to ask in this analysis? Will a given piece of data give me a solution?
- What assumptions am I starting from for a given analysis or dataset?
- Is answering that question within my capabilities or do I have “missing pieces”?
- How can I prioritize my data requests?
2nd order concepts:
- Am I using a “reproducible” or an ad hoc approach?
- Given an analytics question (and perhaps an answer), can I test for robustness? Longitudinal accuracy? Bias? etc.
- How can I better balance business needs with request efforts (meta work)?
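As one way to make the robustness question concrete, here is a minimal sketch of a percentile bootstrap around a mean, using only the Python standard library. The `orders` values and the function name `bootstrap_mean_interval` are hypothetical, made up for illustration. The fixed seed is what makes the run reproducible rather than ad hoc.

```python
import random
import statistics


def bootstrap_mean_interval(values, n_resamples=2000, alpha=0.05, seed=42):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed: same inputs give the same interval
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical daily order values, including two outliers.
orders = [12.0, 15.5, 9.8, 30.2, 14.1, 13.3, 11.9, 45.0, 12.7, 16.4]
low, high = bootstrap_mean_interval(orders)
print(f"mean={statistics.fmean(orders):.2f}, interval=({low:.2f}, {high:.2f})")
```

A wide interval is a hint that the headline number depends heavily on a few observations, which is worth knowing before anyone makes a decision from it.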
3rd order concepts:
- Given an analytics solution, how can I optimize it? Scale it? Orchestrate it? Automate it?
- Whole “data stack” perspective (What are all the components of my stack and what components add friction and where?)
- How can I lead my organization into a more data-centric reality? (Resources? Education?)
Skills / Tech #
1st order skills (manipulate some amount of data directly):
- basic data sources (flat files, binary datasets)
- Excel tables, basic pivots, formulas etc.
- tabling, pivoting, graphing etc.
- a point and click analytics tool
- select / query from a database
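For the last item, a small sketch of what "select / query from a database" can look like in practice, using Python's built-in `sqlite3` module with an in-memory database. The `sales` table and its rows are invented for the example.

```python
import sqlite3

# In-memory database so the example needs no files or servers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 50.0)],
)

# Aggregate per region: the SQL equivalent of a basic pivot.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

for region, total in rows:
    print(region, total)
```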
2nd order skills (data manipulation or analysis leverage):
- advanced data sources (nested JSON data, paginated APIs, etc.)
- advanced Excel knowledge (Power Query, Power Pivot, DAX measures etc.)
- complex database operations (triggers, stored procs)
- an ETL platform
- an analytics platform
- a version control system
- custom-code analytics
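To illustrate the "nested JSON" item above, here is a minimal sketch that flattens a nested record into a single level of dotted keys, which is often the first step before loading it into a table. The `flatten` helper and the sample record are hypothetical, and this version silently drops empty dicts and lists.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into one dict with dotted keys."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        # Scalar leaf: drop the trailing dot added by the parent level.
        flat[prefix[:-1]] = obj
    return flat


# Hypothetical API record with one nested object and one nested list.
record = json.loads('{"id": 7, "customer": {"name": "Ada", "tags": ["new", "eu"]}}')
flat_record = flatten(record)
print(flat_record)
```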
3rd order skills (meta-manipulation, scaled):
- deep database skill set (distributed, HDFS at scale)
- an orchestration platform
- software engineering skills (api / pubsub / )
- advanced mathematics / statistics
Network #
1st order network (1-1 network development):
- Connect with peers and alumni, or cold-contact analytics professionals directly.
- Tutor, manage, or mentor others to become better analytics professionals.
- Who among my coworkers or former coworkers can I offer additional value to directly?
- Who in my network would I feel comfortable asking for a referral from?
- What Meetup / Slack group exists for analytics professionals in my city / region?
2nd order network (“Always on” resource development):
- What kind of blog / website can I maintain about myself and my projects?
- What community can I become a contributor in? (stackoverflow, discourse, kaggle)
- What contributions can I make to open source projects in my area? (data visualization, ML, etc.)
3rd order network (Improve the domain itself):
- What community does not yet exist that I can create?
- What are the bad practices that are generally perpetuated and how can they be remediated?