Combining Data, Tech and Social Science to understand the Indian Judiciary
The judicial system in India is an interconnected web of court complexes and establishments. It is a multi-tier system of 674 district courts, 25 state High Courts, and the Supreme Court, all working together to deliver justice to the country's 1.3 billion citizens. Information on new case registrations and on pending and disposed cases creates a massive pool of legal data.
Data management, standardization and accessibility remain huge challenges, and these official platforms are rarely cited in legal research on important topics such as case pendency and case-law analysis. This, coupled with stories of judicial corruption in the media, has fueled a low level of trust in the Collegium. Current research is highly fragmented and powered by data from closed-source tools, which makes it extremely difficult to validate findings and conduct reproducible research.
We envisage an 'Open Judicial Data Platform' that gives researchers easy access to a range of information - making it possible to research the oldest case while still accessing the latest court judgements - and takes the burden of data cleaning off their shoulders, so they can spend their time building the narrative. By building data tools on top of this data, we close the information loop: research outputs become easier for other stakeholders to digest, eventually widening their scope to participate in the legal process.
I would like to share some insights on:
- the process of creating this platform with our partners, including legal researchers, lawyers and data scientists
- overcoming the barrier of understanding the legal space
- handling data and tech challenges using open tools and Frictionless Data packages
- making the platform available to a diverse set of user classes.
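To illustrate the Frictionless Data approach mentioned above: each dataset is bundled with a machine-readable descriptor (datapackage.json) that documents its schema, so downstream researchers do not have to rediscover field meanings. The sketch below is illustrative only; the resource and field names are hypothetical, not the platform's actual schema.

```json
{
  "name": "district-court-cases",
  "resources": [
    {
      "name": "cases",
      "path": "cases.csv",
      "schema": {
        "fields": [
          {"name": "cnr_number", "type": "string"},
          {"name": "court_name", "type": "string"},
          {"name": "filing_date", "type": "date"},
          {"name": "case_status", "type": "string"}
        ]
      }
    }
  ]
}
```

A descriptor like this can be validated against the data with the open-source `frictionless` tooling, which is what makes the cleaned datasets easy to verify and reuse.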
As one of the use cases of the platform, I would also like to demonstrate a case study where we used open-source entity recognition tools such as spaCy on the text of legal judgements to understand juvenile justice activity in the country.
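A minimal sketch of the kind of entity recognition involved, using spaCy's rule-based EntityRuler on a blank English pipeline (no pretrained model required). The labels and patterns here are illustrative assumptions, not the project's actual rules:

```python
import spacy

# Blank English pipeline: tokenization only, no pretrained model download.
nlp = spacy.blank("en")

# EntityRuler lets us tag domain-specific entities with token patterns.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    # Hypothetical patterns for an act name and a section reference.
    {"label": "ACT", "pattern": [{"LOWER": "pocso"}, {"LOWER": "act"}]},
    {"label": "SECTION", "pattern": [{"LOWER": "section"}, {"IS_DIGIT": True}]},
])

doc = nlp("The accused was charged under Section 4 of the POCSO Act.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

In practice such rules would be combined with a statistical NER model fine-tuned on annotated judgement text, but a ruler like this is a common first step for legal-domain tuning.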
Outline/Structure of the Case Study
- How judiciary works in the country
- The official E-courts data platform
- Challenges faced by Indian judiciary
- How data (ML/AI) can help solve some of these challenges
- Challenges faced by the legal tech ecosystem
- Co-Creating the Open Judicial Data Platform
- How we chalked out a path to understand the legal space with our partners
- Observations from the pan-India scoping study on lower courts and high courts
- Designing the data pipeline (Data Architecture)
- Citizen participation
- Creating data tools (NLP/ML/AI)
- Reproducible legal research
- The way ahead (How we plan to involve the community)
- Case Study (Demo)
- Juvenile justice in the country
- Analyzing the case-laws under the POCSO (Protection of Children from Sexual Offences) Act, 2012
- Using natural language processing tools such as spaCy to understand legal judgements
- How data science and social science can be used to understand the low conviction rates in POCSO cases
- Understand the technicalities and challenges behind a massive pool of judicial data that is growing rapidly every day
- The philosophy of co-creation: how data scientists can partner with domain experts to co-create important data science tools
- Community contribution: how we as citizens can contribute to this legal-data and tech ecosystem
- Our tech and data stack, and strategies for creating a near-real-time data platform
- Natural language processing use cases: how spaCy can be used out of the box and tuned to context-specific use cases like legal tech.
Target audience: Data Scientists, Civic Tech enthusiasts, NLP Practitioners, Legal Tech Researchers, HCI enthusiasts