Data Scientist Spotlight: Mariana Almeida

This week, we are speaking with Mariana Almeida, a volunteer data scientist at Health Lake, a Brazilian non-profit organization. Health Lake is democratizing access to good-quality healthcare data in Brazil to help the government and institutions act more quickly on disease control.

Mariana is a Datapane super-user who has published more than 28 reports on Datapane.

Let’s get to know more about Mariana!

Interview

Q: Tell us about yourself!

Mariana: I am a chemical engineer working in the healthcare business, but I have always loved programming. When I am not working or learning data science, I am climbing, trekking long distances, and doing volunteer work (well, this was the reality before the pandemic). Recently, I used the extra free time that working from home gave me to learn about Python and data science. I learned everything I know by myself, searching the internet, taking online courses, and reading Python library documentation.

Nowadays, I am a volunteer data scientist at a Brazilian non-profit organization called Health Lake.

Q: Why do you prefer to visualize data using Python instead of a proprietary tool, like Tableau?

Mariana: I learned how to use Tableau just because some job descriptions ask for it, but to tell the truth, I prefer to build things for myself, preferably using open-source software. I like the freedom of working with open-source software. Besides, there is a democratic side to this: using Python means that everyone can build things for themselves and contribute somehow to a greater community.

Q: How do you like to contribute to the data community?

Mariana: I am still learning, but I am a fast learner. I hope to soon be able to help other people start learning about programming and data science from scratch. At the moment, I am using the knowledge I have gained so far to work on a project for a non-profit organization.

Q: What tips and resources would you have for someone looking to learn about Python data visualization?

Mariana: I think it is great to take courses to get an overview of the Python language and libraries, but nothing is better than learning by doing. When I first got interested in programming, in the ’90s, I did not have access to the internet all the time, so I had to learn a lot from offline resources (like books on the Pascal language, lol). Nowadays it is so easy to start a project and get help from people in communities and from resources such as Stack Overflow, Slack communities, etc. So I would say that you should choose a problem you want to solve or a piece of information you would like to analyze and start a Python visualization project on it.

Q: What do you think are the best ways to get noticed as someone telling stories with data?

Mariana: I am trying to specialize in telling good stories with data, and what I have learned so far is that you must learn to ask good questions and answer them with visualizations. It is really important to learn how to communicate in a way that everyone can understand. One of the things that make data science so fundamental in a business is its power to solve problems. I have already noticed progress in my current job just because I learned how to summarize data in visualizations instead of just explaining it in text in PowerPoint presentations.

Q: What tips would you have for someone just starting out?

Mariana: Choose a topic you are interested in learning more about or a problem you would like to solve, and dive deep into it using Python. If you know nothing about Python, I would suggest taking the Python and data science courses available on freeCodeCamp. I started from scratch there and moved on to start data projects on Kaggle, another great resource. I read a lot of notebooks from data scientists on Kaggle to learn how a ‘real data scientist’ analyzes data.

Q: What are your favorite libraries and resources for creating visualizations and data stories?

Mariana: I love the Plotly library. I prefer dynamic visualizations over static ones, so Plotly is perfect. Using Datapane is even better because I can create reports and include the plots in my blog posts. For static visualizations in printed reports and presentations, I use the seaborn library.

Check out Mariana's latest work

How to connect with Mariana

Congrats Mariana on a much-deserved spotlight and thank you for sharing your thoughts and experiences with us. We look forward to seeing more of your amazing reports in the future!
