Harshil Darji

Data Scientist & AI Researcher

I am a data scientist working at the intersection of natural language processing and law. My research focuses on building and evaluating models for Legal Named Entity Recognition and Relation Extraction, as well as constructing citation networks and structured datasets for German legal texts. I am particularly interested in how domain-adapted language models can support reliable, privacy-compliant processing of court decisions and legal documents.

Beyond model development, I work on turning research prototypes into usable systems, including anonymization pipelines, similarity search, and graph-based analysis of legal case relationships. My goal is to combine careful data modeling with robust machine learning to make complex legal texts more accessible and analyzable at scale.

Dec 2024 – Present

Data Scientist / AI Expert (KI-Experte)

Hochschule für Technik und Wirtschaft Berlin, Berlin, Germany
  • Developed a citation network of German legal cases using Neo4j, allowing for efficient cross-referencing and analysis of case relationships.
  • Enhanced an earlier Legal NER LLM to automate the anonymization of German legal documents in support of compliance with privacy regulations.
  • Implemented a legal text similarity search as a foundation for a RAG-based system to support more accurate, context-aware legal research.
Sep 2021 – Sep 2024

Data Science Researcher

University of Passau, Passau, Germany
  • Fine-tuned a German BERT model on a legal dataset, achieving an F1 score of 99.49%, and compiled a dataset of 2,944 German legal references.
  • Co-developed a GDPR-compliant dataset of 44 privacy policies annotated with 33 entity types to support Named Entity Recognition (NER) and Relation Extraction (RE) in NLP models.
  • Independently fine-tuned and published LLMs for NER and RE in privacy policies, achieving F1 scores of 74% and 83%, respectively.
Oct 2016 – Jan 2017

Junior Android Developer

Profero Techno Pvt. Ltd., Mumbai, India
  • Developed an Android application to help local businesses promote sales with hourly discounts, integrating Firebase for the real-time database.
  • Wrote REST web services to support the Android and iOS versions and built an interface that lets businesses manage their discounts independently.
2018 – 2021

MS Computer Science

University of Passau, Passau, Germany
2012 – 2016

BS Information Technology

Gujarat Technological University, Gujarat, India
Thesis: Steganography

Models