I did a thing...
Over the summer, I met a couple of times with folks who were interested in doing some “Existing Data” or “Secondary Data” research studies. These studies take an existing data set that was collected for another stated purpose and develop research questions that may be answerable using that data. Many times, these data sets are longitudinal and ask a WHOLE LOT of questions about a respondent’s conditions and life circumstances in addition to the ones that were part of the original study.
The data has already been collected and warehoused. Much of the time, it is freely available, as with a federal survey or the like. So, a fertile ground for re-purposing.
I don't know much about big data manipulation (R, SQL, and the like), so I watched some webinars on how to use those tools for Big Data analysis. Some of those were useful, but I felt like I needed some more formal (rather than autodidactic) training / preparation. I found Coursera’s Google Data Analytics certificate, which includes material in SQL, BigQuery, R, and additional mastery work in Excel formulas for large spreadsheets, and visualizations through Tableau. This is not an ad for Coursera, but the course was well-paced and suitable for beginners to advanced beginners (like me, because I had used the tools before).
Anyway, there were eight courses in the certificate, with the final one a capstone experience: creating a case study for analysis and interpretation, much like a junior data analyst would do. For those interested in seeing my case study, check out this website. I posted it privately because I was uncertain of the depth of my case analysis. Looking at others on Kaggle, I may not need to feel so uncertain…
Anyhow, I finished and got the certificate above! Now, I feel like I can go back to the webinar I swam in over the summer and see about digging into some of that data. This time, with some better analytical tool experience behind me.