Function/Role: Senior Software Engineer
Location: Hopkinton, Massachusetts
Area of Expertise: Big Data Analytics, Data Analytics, Data Engineering, Emerging Technologies, Enterprise Data Warehousing
Education: Master of Science in Information Systems from Northeastern University, Boston MA
LinkedIn: Neha Sharma
What’s your background? Have you always been interested in technology? What drew you to work in this field?
For the past eight years, I have worked with various data warehousing and business intelligence tools and technologies. I have always enjoyed working with data, and my career has given me many good opportunities to do just that, in roles such as Software Engineer, Data Analyst, ETL Developer, Report Developer, and Business Analyst. I believe good data management and data analysis practices are key to any project’s success.
I became really interested in the big data analytics field while studying at Northeastern University. I worked on a big data project analyzing worldwide disaster data from 1900 to 2008. As part of this project, I focused on analyzing factors such as demographics and geography and how each contributed to natural calamities in different parts of the world. Though this is just one example of what I worked on, it helped me realize the potential of big data analytics and how we can put it to work for us.
What most interests you about your work? What is your current job like?
I am currently working as a Senior Software Engineer at EMC under the EOS2 reporting services organization. My responsibilities include data engineering activities such as data munging, creating data ingestion pipelines, and installation and configuration of data enrichment tools such as Alteryx, R Shiny Server, and Tableau. I also provide support with data analytics by leveraging R, in-database analytics, and other analytics tools to analyze data in support of an enterprise data warehouse ecosystem that supports both traditional and advanced analytics for over 1500 users.
There is so much data all around us, and I believe that most, if not all, of our questions can be answered with this data. With the various tools of analytics, it is just a matter of asking the right questions of the data. The thing that I like best about my job is that I get to work on different projects using data that is mined from more than 30 operational systems. I get to apply the latest tools and technologies to mine data and generate information that helps EMC achieve high-value, successful business outcomes.
Are you active in the EMC Proven Professional Community or the ECN?
I am not very active on ECN. I do plan to get more involved in the future, as it’s always beneficial to share and seek new ideas.
Can you describe the value to you of your EMC Proven Professional certifications? Why did you choose to get certified? How has the EMC Proven Professional Program benefitted you professionally?
Having worked on different data analysis projects, I’ve learned that it is very important to use the right analytical methods when approaching a problem. The EMC Proven Professional Data Scientist track provides a thorough education in the disciplines of big data and data science.
Each of these courses delivers the right balance between theory and hands-on learning with emerging technologies such as the Hadoop ecosystem of tools, Greenplum MPP database, R, and Gephi. The curriculum provides ample hands-on experience with the R programming language, Java, and Python, all of which are key tools in the data scientist’s tool bag.
I have taken each of these courses and they have greatly enhanced my knowledge of big data, analytics methodologies, and all the tools used in these disciplines. Also, the certification validates your ability to apply the techniques and tools required for Big Data and Data Science.
You are the first woman to sit for the exam and earn the Data Science Advanced Analytics Specialist certification. What was it like taking this course and preparing for the exam?
I took an instructor-led, five-day classroom advanced analytics course. The Advanced Analytics course builds on the Associate-level course and provides in-depth information about the latest analytics tools and technologies. This course extensively covered text analytics (natural language processing) and social network analysis, in addition to covering the latest big data analytics tools.
This course was a great learning experience for me. During training I got to interact with many other data science professionals and learn about the different projects and initiatives that they were working on. The class was very interactive and our instructor encouraged us to work collaboratively to solve sample analytical problems that were included in the course.
Taking this course really helped prepare me for the exam. Along with this, I also studied the student guide provided with the course. The student guide is well organized and covers each module in detail.
Have you written any Knowledge Sharing articles? Do you plan to?
I wrote papers related to big data analytics as part of my master’s degree projects. I am also a guest blogger at www.datatechblog.com, where I write on big data topics. I plan to write more Knowledge Sharing articles related to data analytics.
What big project are you working on now? Or what major project have you recently completed?
Currently, I am working on a project that analyzes “click-stream” data for a large user community within EMC. Called “The Hub,” this portal provides a central access point for reporting, services, collaboration, and data-driven applications used to support EMC’s mid-market storage division. Analyzing “The Hub” provides valuable data that tells us what types of actions users are taking, what the most popular widgets are, and how the portal is being used as a function of job role. All of this data allows us to use advanced analytics to improve the overall user experience. For this project, I researched and implemented a database solution that is well suited to click-stream data and time series analysis. I also built visualizations and dashboards in Tableau for reporting on and analyzing users’ click-stream data.
Where do you see yourself in 5 years? Are there any new technologies/skills you want to learn?
My plan is to continue my big data and advanced analytics journey. I am really interested in text analytics, sentiment analysis, and social network analysis, and I have taken to self-study to further enhance my skills in these disciplines. From the technology perspective, I have begun a deep dive into NoSQL technologies with a focus on graph databases. In five years I hope to be working as either a data scientist or a data engineer, but given the rapid rate at which technology is changing, I may find myself working on the next greatest thing.
Name a major achievement in your career or any awards that you have won.
I was awarded Master Valedictorian by Northeastern University and was the only one in my program to achieve it. I have also received Excellence@EMC awards for my contributions in helping to make EMC an industry leader and a Great Place to Work.
You work with an analytics team that supports a large part of EMC’s Core Technology Division, and sometimes collaborate with members of the EMC Big Data team. Big Data is a rapidly changing and expanding field. What are your predictions for the future of Big Data and Data Science?
I believe the future of big data is very bright. The amount of information and data that we have available today is far more than we have ever had before. There are many open source tools available today, such as Hadoop, Hive, HBase, and Mahout, that not only enable cost-effective storage of data but also allow us to run high-performance data analytics on large data sets.
These newer tools, including NoSQL stores such as HBase, allow easy processing of data available in almost any form, as opposed to traditional relational databases and warehousing technologies, which limit data analysis to data available only in certain allowable formats.
Today, more and more organizations are leveraging big data analytics tools and data science techniques to enable more effective business planning that can drive business transformation. I believe it’s a very cutting-edge field, with lots to explore, and it will be a major part of how we plan and do things in the future.
What advice do you have for someone considering a career in Data Science?
Data science is an exciting field that is a lot of fun and offers a promising career. It is an exploratory and continuously evolving field, so it is important for anyone considering a career in this field to focus on continuous learning and education to keep up with all the new advancements in this area.