Open Data Science for Smart Manufacturing

Open Data offers a tremendous opportunity to transform today’s manufacturing sector into smarter manufacturing. Smart Manufacturing initiatives include digitalising production processes and integrating IoT technologies that connect machines and collect data for analysis and visualisation.

This talk illustrates the linkages between industries within the manufacturing sector through the lens of Open Data Science. Data on manufacturing companies, including company profiles, officers and financials, will be scraped from UK Open Data APIs.
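
As a rough illustration of the scraping step described above, the sketch below builds an authenticated request and trims a company profile down to a few fields. The talk does not name the API it uses; the Companies House public data API, its base URL, the key-as-username Basic authentication, and the `company_number`, `company_name` and `sic_codes` fields are assumptions here. The helper names `build_request`, `fetch_json` and `company_summary` are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumption: the Companies House public data API. The talk only says
# "UK Open Data APIs" and does not name a specific service.
BASE_URL = "https://api.company-information.service.gov.uk"


def build_request(path, api_key, params=None):
    """Build a GET request that sends the API key as the Basic-auth username."""
    url = BASE_URL + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def company_summary(profile):
    """Keep a few fields from a company profile; absent keys become None or []."""
    return {
        "number": profile.get("company_number"),
        "name": profile.get("company_name"),
        "sic_codes": profile.get("sic_codes", []),
    }
```

With a valid key, `fetch_json(build_request("/search/companies", api_key, {"q": "steel"}))` would be the call that actually hits the network. Keeping request construction separate from sending makes the scraping logic testable offline.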

Typical tasks include data preprocessing, network analysis of industries, clustering, and deploying the model as an API on Google Cloud Platform. The presenter will discuss why an 'Analytical Thinking' approach is needed to handle complex big data projects, and how to overcome the challenges of working on real-life data science projects.
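
One possible way to frame the network analysis of industries is sketched below: two industries are linked whenever the same officer serves at companies in both, and the edge weight counts such shared officers. This linking rule, the function names and the sample records are illustrative assumptions, not the presenter's method or data.

```python
from collections import defaultdict
from itertools import combinations


def industry_network(appointments):
    """Return {(industry_a, industry_b): weight}; weight counts shared officers."""
    industries_by_officer = defaultdict(set)
    for officer, industry in appointments:
        industries_by_officer[officer].add(industry)

    edges = defaultdict(int)
    for industries in industries_by_officer.values():
        # Sorting gives each undirected edge a single canonical key.
        for a, b in combinations(sorted(industries), 2):
            edges[(a, b)] += 1
    return dict(edges)


def weighted_degree(edges):
    """Sum the weights of all edges touching each industry."""
    degree = defaultdict(int)
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return dict(degree)


# Made-up records: (officer id, industry of a company where the officer serves).
appointments = [
    ("officer-1", "steel"),
    ("officer-1", "automotive"),
    ("officer-2", "automotive"),
    ("officer-2", "electronics"),
    ("officer-3", "steel"),
    ("officer-3", "automotive"),
]

edges = industry_network(appointments)
degree = weighted_degree(edges)
```

In this toy data, "automotive" ends up with the highest weighted degree because it shares officers with both other industries. On real data the same edge list could be handed to a graph library for clustering or community detection.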


Outline/Structure of the Talk

Data Scraping from UK Open Data APIs

Data Preprocessing and Integration

'Analytical Thinking' for handling financial data

Network Analysis for Industries

Deployment on Google Cloud Platform

Learning Outcome

  • This talk will provide an overview of how businesses can leverage Open Data using data science technologies.
  • The talk will emphasise how to build scalable data science pipelines with ease, even for attendees with no prior experience in data science.
  • Attendees will understand the importance of every stage of a data science project: data extraction, data cleaning, data integration, feature engineering, data analysis and model deployment.

Target Audience

Statisticians, Business Professionals interested in Data Science Applications, Data Science Enthusiasts and Data Engineers

Prerequisites for Attendees

  • Basic knowledge of Python
  • Basic knowledge of Network Analysis
Submitted 1 month ago

Public Feedback

  • Anoop Kulkarni  ~  1 week ago

    Neha, thank you for your proposal. It sounds very interesting. When you mention web scraping UK databases, would this talk be at a concept level, or at a hands-on tutorial and/or demo level?

    Looking forward to it. Can you suggest a time breakdown for your talk?



  • Liked Anupam Purwar

    Anupam Purwar - Prediction of Wilful Default using Machine Learning

    45 Mins
    Case Study

    Banks and financial institutions in India have increasingly faced defaults by corporates over the last few years. In fact, NBFC stocks have suffered huge losses in recent times. This has triggered a contagion that spilled over to other financial stocks and adversely affected benchmark indices, resulting in short-term bearishness. It is therefore imperative to investigate ways to prevent, rather than cure, such situations.

    However, banks face a twofold challenge: identifying probable wilful defaulters from the rest, and moral hazard among bank employees, who are often found to be acting at the behest of the promoters of defaulting firms. The first challenge is aggravated by the fact that due diligence of firms before a loan is extended is a time-consuming process. The second points to the need for automated safeguards to reduce malpractices originating from human behaviour. Automating the loan sanctioning process is a possible solution to both.

    Hence, we identified important firmographic variables, viz. financial ratios and their historic patterns, by looking at the firms listed as the "dirty dozen" by the Reserve Bank of India. Next, we used k-means clustering to segment these firms and label them into categories, viz. normal, distressed defaulter and wilful defaulter. In addition, we used text and sentiment analysis to analyze the annual reports of all BSE and NSE listed firms over the last 10 years. From this, we identified word tags that resonate well with the occurrence of default and are indicators of the financial performance of these firms. A rigorous analysis of these word tags (anagrams, bi-grams and co-located words) over a period of 10 years for more than 100 firms indicates a relation between the frequency of word tags and firm default.

    Lift estimation of firmographic financial ratios, namely the Altman Z-score, and of word-tag frequency uncovers, for the first time, the importance of text analysis in predicting firms' financial performance and default. Our investigation also reveals the possibility of using neural networks as a predictor of firm default. The neural network we developed uses open source machine learning libraries, which opens up the possibility of banks deploying such a model with a small one-time investment. In short, our work demonstrates the ability of machine learning to address challenges related to the prevention of wilful default. We envisage that neural network based prediction models and text analysis of firm-specific financial reports could help the financial industry save millions in the recovery and restructuring of loans.
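
    For readers unfamiliar with the Altman Z-score mentioned above, here is a minimal sketch of the original 1968 formulation for publicly traded manufacturing firms, with its conventional cut-offs of 1.81 and 2.99. The abstract does not say which variant of the score the authors used, so treating it as this one is an assumption. The function names and the sample figures are made up for illustration.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, sales, total_assets, total_liabilities):
    """Original (1968) Altman Z-score for publicly traded manufacturing firms."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_value_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def z_zone(z):
    """Map a score to the conventional zones for the original model."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"


# Made-up balance-sheet figures, all in the same currency unit.
z = altman_z(
    working_capital=20, retained_earnings=30, ebit=10,
    market_value_equity=80, sales=150,
    total_assets=100, total_liabilities=50,
)
zone = z_zone(z)
```

    A score like this could serve as one of the firmographic features fed to a clustering or classification model alongside word-tag frequencies.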