At the Data Science Croatia meetup, we discussed the unique technical and data challenges in developing and maintaining models for the identification of personal data across enterprise structured and unstructured data sources.
Ana-Marija focused specifically on the technical side and on the challenges of training ML models when the objective is to identify personal data, which, by definition, is not readily or legally available. We talked about heterogeneous systems and how to create a learning context for tables and bullet-point documents.
Among other topics, Ana-Marija discussed the models for identifying personal data across structured and unstructured data sources that we have developed and implemented for the data discovery module of Data Privacy Manager.
Speaker
Ana-Marija Petric completed her Master's in Linguistics at the University of Oxford. She worked in the Netherlands for several years, focusing on AI and machine learning in different industries. She has always been interested in language and personal data, which aligns perfectly with her current position as Head of Research at Legit Software, the company behind Data Privacy Manager, which has been recognized by Gartner and included in the Forrester Wave.
She is passionate about democratizing access to compliance and focuses in particular on personal data identification across different languages and scripts.
When not working, she learns Chinese, travels, plays video games, reads sci-fi books, and follows new advances in space exploration and astronomy.