Brushing Up on Data Science Skills: A Journey of Rediscovery

As part of a recent job interview process, I set out to refresh my data science skills ahead of the technical portion of the interview. While I ultimately didn’t secure the position, the experience was invaluable: it renewed my appreciation for core data science concepts and gave me new insights into the tools and strategies that make this field so fascinating.

Exploring Data Engineering Techniques

One of the key areas I revisited was the process of combining data from multiple tables, a cornerstone of data science workflows. I looked closely at the ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) methodologies, each of which comes with distinct advantages and trade-offs:

  • ETL follows a structured approach where data is extracted, transformed to fit the target system’s requirements, and then loaded into a data warehouse. It’s systematic and reliable but lacks the flexibility for real-time data handling and can be resource-intensive if transformations need to be redone for different teams.
  • ELT, in contrast, skips the intermediate transformation step: raw data is loaded into the target system first and transformed there as needed. This allows for faster, real-time data handling and reduced costs. However, because raw and possibly sensitive data lands in the target system before it has been cleaned or masked, data security can pose challenges in this approach.
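To make the contrast concrete for myself, I put together a minimal sketch in Python using the standard-library sqlite3 module. The table names, column names, and sample rows are my own illustrative assumptions, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3

# Hypothetical raw rows standing in for data extracted from a source system.
RAW_ORDERS = [
    (1, " alice ", "19.50"),
    (2, "BOB", "5.00"),
]


def run_etl(rows):
    """ETL: transform in application code, then load only the cleaned rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
    # Transform step happens before anything reaches the target.
    cleaned = [(i, name.strip().lower(), float(amount)) for i, name, amount in rows]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", cleaned)
    result = conn.execute(
        "SELECT id, customer, amount FROM orders ORDER BY id"
    ).fetchall()
    conn.close()
    return result


def run_elt(rows):
    """ELT: load the raw rows as-is, then transform inside the target with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)
    # Transform step is a view over the raw table, evaluated on demand.
    conn.execute(
        "CREATE VIEW orders AS "
        "SELECT id, lower(trim(customer)) AS customer, "
        "CAST(amount AS REAL) AS amount FROM raw_orders"
    )
    result = conn.execute(
        "SELECT id, customer, amount FROM orders ORDER BY id"
    ).fetchall()
    conn.close()
    return result
```

Both functions end up with the same cleaned rows. The difference is where the transformation runs and whether the untouched raw data is kept in the target system.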

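Understanding the nuances of data extraction was equally enlightening. Two techniques stood out as powerful tools for real-time and efficient data movement: incremental extraction, which pulls only the records that have changed since the last run, and partial extraction with update notifications, where the source system signals which records have changed.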
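Incremental extraction is often implemented with a "watermark", the timestamp of the most recent change already extracted. Here is a small sketch of that idea in plain Python. The `updated_at` field and the function name are hypothetical choices of mine, and a list of dictionaries stands in for a source table.

```python
from datetime import datetime


def extract_incremental(rows, last_watermark):
    """Return the rows changed after last_watermark, plus the new watermark."""
    changed = [row for row in rows if row["updated_at"] > last_watermark]
    # If nothing changed, keep the old watermark so the next run starts there.
    new_watermark = max(
        (row["updated_at"] for row in changed), default=last_watermark
    )
    return changed, new_watermark


# Hypothetical source table.
source_rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]

changed, watermark = extract_incremental(source_rows, datetime(2024, 1, 3))
```

Each run stores the returned watermark and passes it into the next run, so records that have not changed are never read twice.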

The transformation phase also revealed its intricacies, emphasizing the importance of rigorous testing and data cleaning processes to ensure smooth integration. Learning to identify and resolve inconsistencies, deduplicate records, and design schemas that align with business needs was a fulfilling exercise.
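As an illustration of the cleaning and deduplication step, here is a minimal sketch in plain Python. The record layout (`email`, `name`, `updated_at`) and the rule "keep the most recently updated record per email" are assumptions I made for the example, not a general recipe.

```python
def clean_and_deduplicate(records):
    """Normalize emails, then keep the most recently updated record per email."""
    latest = {}
    for record in records:
        # Resolve inconsistencies: stray whitespace and mixed case.
        email = record["email"].strip().lower()
        cleaned = {**record, "email": email}
        current = latest.get(email)
        # ISO-8601 date strings sort correctly as plain strings.
        if current is None or cleaned["updated_at"] > current["updated_at"]:
            latest[email] = cleaned
    return sorted(latest.values(), key=lambda r: r["email"])


# Hypothetical input with one duplicate hidden by inconsistent formatting.
raw_records = [
    {"email": " Ada@Example.com ", "name": "Ada", "updated_at": "2024-01-01"},
    {"email": "ada@example.com", "name": "Ada L.", "updated_at": "2024-02-01"},
    {"email": "grace@example.com", "name": "Grace", "updated_at": "2024-01-15"},
]

deduplicated = clean_and_deduplicate(raw_records)
```

Normalizing before comparing is the important part: without it, the two "Ada" rows would look like different customers.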

Finally, I revisited the importance of incremental loading as the preferred method for efficient data handling, balancing performance with cost-effectiveness.
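A minimal sketch of incremental loading, again with the standard-library sqlite3 module: only the changed rows are written, and existing rows are updated in place instead of reloading the whole table. The `customers` table and its columns are hypothetical, and I use SQLite's `INSERT OR REPLACE` as a simple stand-in for whatever upsert or merge statement a real warehouse provides.

```python
import sqlite3


def load_incremental(conn, changed_rows):
    """Upsert only the changed rows instead of reloading the whole table."""
    conn.executemany(
        "INSERT OR REPLACE INTO customers (id, name) VALUES (?, ?)",
        changed_rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

# First batch: two new customers.
load_incremental(conn, [(1, "Ada"), (2, "Grace")])
# Later batch: one update (id 2) and one insert (id 3); id 1 is untouched.
load_incremental(conn, [(2, "Grace H."), (3, "Alan")])

loaded = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
```

The second call touches two rows rather than three, which is where the performance and cost savings come from as tables grow.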

Regular Expressions: A New Frontier (for me)

While data integration was a refresher, exploring regular expressions felt like venturing into new territory. This powerful tool for pattern matching and text processing opened up many possibilities, from validating data formats to efficiently parsing and analyzing strings. I learned to apply concepts like character classes, anchors, groups, and lookarounds, deepening my understanding of how to manipulate and query text-based data. I found the table below especially helpful (reference: https://regexr.com/).

Character classes
  .               any character except newline
  \w \d \s        word, digit, whitespace
  \W \D \S        not word, digit, whitespace
  [abc]           any of a, b, or c
  [^abc]          not a, b, or c
  [a-g]           character between a & g

Anchors
  ^abc$           start / end of the string
  \b \B           word, not-word boundary

Escaped characters
  \. \* \\        escaped special characters
  \t \n \r        tab, linefeed, carriage return

Groups & Lookaround
  (abc)           capture group
  \1              backreference to group #1
  (?:abc)         non-capturing group
  (?=abc)         positive lookahead
  (?!abc)         negative lookahead

Quantifiers & Alternation
  a* a+ a?        0 or more, 1 or more, 0 or 1
  a{5} a{2,}      exactly five, two or more
  a{1,3}          between one & three
  a+? a{2,}?      match as few as possible
  ab|cd           match ab or cd
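
To see a few of these cheat-sheet entries in action, I tried them out with Python's built-in re module. The sample strings are made up for illustration, and other regex engines may differ in small details.

```python
import re

# Anchors, character classes, and quantifiers: validate a YYYY-MM-DD format.
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
is_date = bool(DATE_RE.match("2024-03-15"))
not_date = bool(DATE_RE.match("15/03/2024"))

# Capture groups: pull the individual pieces out of a date string.
year, month, day = re.match(r"^(\d{4})-(\d{2})-(\d{2})$", "2024-03-15").groups()

# Backreference: find a word that is accidentally repeated.
doubled = re.search(r"\b(\w+) \1\b", "this is is a typo").group(1)

# Positive lookahead: numbers followed by "kg", without consuming "kg".
weights = re.findall(r"\d+(?=kg)", "12kg, 7lb, 30kg")

# Greedy vs. lazy quantifiers: match as much or as little as possible.
greedy = re.findall(r"<.+>", "<a><b>")
lazy = re.findall(r"<.+?>", "<a><b>")

# Alternation: match either of two literal options.
pets = re.findall(r"cat|dog", "cat, bird, dog")
```

The greedy pattern returns the whole string as one match, while the lazy version returns each tag separately. That difference was the one that tripped me up most often.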

Derivatives and Their Applications

A brief detour into derivatives reminded me of the mathematical foundations underpinning data science. Revisiting these concepts not only refreshed my skills but also highlighted their relevance in areas like optimization, gradient descent, and predictive modeling.
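
To connect derivatives back to optimization, here is a small gradient descent sketch in plain Python. The function f(x) = (x - 3)^2, the learning rate, and the step count are arbitrary choices of mine for illustration. The derivative tells us which direction is downhill, and each step moves a little way in that direction.

```python
def f(x):
    """A simple convex function with its minimum at x = 3."""
    return (x - 3.0) ** 2


def df(x):
    """Analytic derivative of f."""
    return 2.0 * (x - 3.0)


def numerical_derivative(func, x, h=1e-6):
    """Central-difference approximation of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


minimum = gradient_descent(df, x0=0.0)
```

Starting from x = 0, the result converges toward 3, the point where the derivative is zero. The numerical derivative is a handy sanity check when an analytic one is written by hand.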

The Value of Revisiting Fundamentals

Although I didn’t land the position, this preparation reaffirmed the importance of staying connected with foundational skills. It’s easy to become distanced from academic concepts after entering the workforce, but they remain critical for technical interviews and practical problem-solving.

I also discovered how much data science has evolved since I graduated, with newer methodologies and tools now integral to the field. The experience rekindled my passion for continuous learning, and I’m grateful for the opportunity to refine my expertise and add new skills to my toolkit.

In the end, while the outcome wasn’t what I hoped for, the journey itself was rewarding—reminding me that the path of learning is as valuable as the destination.
