Detection and Classification of Fake News Using Convolutional Neural Networks

Schedule: Aug 31st, 02:55 PM - 03:15 PM | Place: Grand Ball Room 1 | 108 interested

The proliferation of fake news and rumours on traditional news media sites, social media, feeds, and blogs has made it extremely difficult to trust any news in day-to-day life. False information has wide implications for both individuals and society. Even though humans can identify and classify fake news through heuristics, common sense and analysis, there is a huge demand for an automated computational approach that achieves scalability and reliability. This talk explains how neural probabilistic models using deep learning techniques are used to detect and classify fake news.

This talk will start with an introduction to deep learning, TensorFlow (Google's deep learning framework), feature extraction with dense vectors (the word2vec model), data preprocessing techniques, feature selection and PCA, and then move on to explain how a scalable machine learning architecture for fake news detection can be built.
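To make the convolutional step concrete, here is a minimal sketch in plain Python of how one convolution filter slides over a sequence of dense word vectors and is then reduced by max-over-time pooling. The two-dimensional embeddings, the filter weights and all function names are illustrative assumptions, not the model presented in the talk; a real system would learn both the vectors and the filters with a framework such as TensorFlow.

```python
from typing import Dict, List

# Hypothetical 2-dimensional embeddings. Real word2vec vectors are learned
# from a corpus and usually have hundreds of dimensions.
EMBEDDINGS: Dict[str, List[float]] = {
    "shocking": [1.0, 0.0],
    "miracle": [0.9, 0.1],
    "cure": [0.8, 0.2],
    "council": [0.0, 1.0],
    "approves": [0.1, 0.9],
    "budget": [0.0, 0.8],
}
UNKNOWN = [0.0, 0.0]  # fallback vector for out-of-vocabulary words

# A hand-picked filter spanning two words (a bigram filter). It rewards the
# first embedding dimension and penalises the second.
KERNEL = [[1.0, -1.0], [1.0, -1.0]]


def embed(tokens: List[str]) -> List[List[float]]:
    """Map each token to its dense vector."""
    return [EMBEDDINGS.get(token, UNKNOWN) for token in tokens]


def conv1d(sequence: List[List[float]], kernel: List[List[float]],
           bias: float = 0.0) -> List[float]:
    """Slide the kernel over the word sequence (no padding) and apply ReLU."""
    width = len(kernel)
    feature_map = []
    for start in range(len(sequence) - width + 1):
        total = bias
        for offset in range(width):
            word_vec = sequence[start + offset]
            weights = kernel[offset]
            total += sum(w * x for w, x in zip(weights, word_vec))
        feature_map.append(max(0.0, total))  # ReLU activation
    return feature_map


def max_over_time(feature_map: List[float]) -> float:
    """Keep only the strongest response of the filter across the text."""
    return max(feature_map) if feature_map else 0.0


def filter_response(text: str) -> float:
    """Embed, convolve and pool a whitespace-tokenised text."""
    tokens = text.lower().split()
    return max_over_time(conv1d(embed(tokens), KERNEL))
```

In a full classifier, many such filters of different widths would each produce one pooled value, and the resulting feature vector would feed a dense layer that outputs the fake/real decision.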

 

Outline/structure of the Session

The session will proceed in the following order:

  • Why identifying fake news is relevant in today's biased world.
  • Showcasing a neural network architecture (CNN) built to solve the problem at scale.
  • Comparison with other state-of-the-art techniques developed for this problem.
  • Challenges faced in identifying fake articles through machine learning methodologies.

Learning Outcome

  • Understand the role of deep learning models in text mining and classification
  • Build a scalable architecture for machine learning applications and deploy it live

Target Audience

Individuals interested in NLP, text mining, data mining and deep learning approaches for text classification

Prerequisite

Understanding of Deep learning, Convolutional neural networks and Text classification.

Submitted 6 months ago

Comments

  • Naresh Jain
    By Naresh Jain  ~  5 months ago

    Can you please share a video link from your past presentation and/or links to articles you've written which will help the program committee appreciate your expertise on this topic?

    • Venkatraman J
      By Venkatraman J  ~  5 months ago

      This is my project research work at the moment, and I have not written a blog about it, as I am yet to submit the project report, which is due in mid-August. The college plagiarism software will detect any words in the report that have been used elsewhere.

      But I started blogging about AI and machine learning on DZone last month.

      https://dzone.com/articles/demystifying-ai-and-machine-learningpart1

      https://dzone.com/articles/demystifying-ai-and-machine-learning-part-2

      I am going to write blog posts about text classification, NLP using machine learning, and this project too, soon.

      • Sarah Masud
        By Sarah Masud  ~  4 months ago

        Hey Venkat,

        As your submission is around mid-August, and the conference is around the end of August, will you be able to share your work publicly by the conference (in terms of a paper, blog or code)?

        It is always better if the audience has some future point of reference, in case they want to understand your topic in detail after the conference.

        • Venkatraman J
          By Venkatraman J  ~  4 months ago

          Sarah Masud, thanks for your comment. I will add the accuracy and precision of the results to my slides, and of course my blog too. I will be able to show my code, paper and the rest only after my university has evaluated and scored my project. I will give references to the papers I read and to related work.

      • Debdoot Mukherjee
        By Debdoot Mukherjee  ~  4 months ago

        What kind of dataset have you applied your technique to? Curious to know whether the dataset is open or proprietary. It will help if you can talk about results that you may obtain with baseline models and compare them with what you get with your approach.

        • Venkatraman J
          By Venkatraman J  ~  4 months ago

          Debdoot Mukherjee, thanks for the comment. Open data sets on Kaggle and elsewhere are not up to date enough to use with machine learning techniques. Human beings also bring their own bias when reading an article or labelling it as fake.

          I am creating my own data repository for this work by building a clean data set from scratch using web techniques. Sure, I will have a slide comparing my results with the state-of-the-art work already done on this problem.

  • Naresh Jain
    By Naresh Jain  ~  5 months ago

    This is a very relevant topic in today's world. Thanks for proposing this Venkat.

    I see 3 distinct sections:

    • Overview of the concepts and tools used - explaining the concepts, sharing different alternatives, giving a demo of the tools and so on.
    • Feature extraction and text classification using Neural nets - how to do it, demo, typical challenges, some insights on how you solved them, future direction and so on.
    • Build scalable architecture - design, develop and deploy

    Can you please explain, how you plan to distribute your 45 mins across these 3 sections? Do you think you would be able to go into enough depth on each section, share your first-hand insights such that the attendees have a clear takeaway from each section?

    Given that you only have 45 mins for this presentation, I feel you are trying to cover too much.

    If you agree, please update the proposal accordingly.

    • Venkatraman J
      By Venkatraman J  ~  5 months ago

      Naresh Jain, I updated the outline to make it crisper and explained the points that will be covered in my presentation.

    • Venkatraman J
      By Venkatraman J  ~  5 months ago

      Thanks for the comments, Naresh. I agree that 45 mins is too little to cover the whole topic, as it is work I have been doing for the last three months.

      I am planning to present in the following order:

      1. The problem being solved: what and why

      2. Techniques followed to solve the problem (neural net approach with probabilistic models)

      3. Neural net architecture used to solve the problem

      4. Accuracy/precision of the solution

      I will stay away from data acquisition and data preparation, as that would drag the talk beyond 45 minutes.

      I would be happy to discuss particular approaches to the problem one on one with people.
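For readers unfamiliar with the two metrics named in the list above, here is a minimal sketch of how accuracy and precision could be computed for a fake-news classifier. The label strings "fake" and "real", the function names and the sample data are assumptions made for illustration; they are not results or code from the project.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[str], y_pred: Sequence[str],
                     positive: str = "fake") -> Tuple[int, int, int, int]:
    """Return (true positives, false positives, true negatives, false negatives)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of all articles that were labelled correctly."""
    if not y_true:
        return 0.0
    correct = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return correct / len(y_true)


def precision(y_true: Sequence[str], y_pred: Sequence[str],
              positive: str = "fake") -> float:
    """Fraction of articles flagged as fake that really are fake."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Made-up labels, purely to show the call pattern.
SAMPLE_TRUE = ["fake", "fake", "real", "real", "real"]
SAMPLE_PRED = ["fake", "real", "fake", "real", "real"]
```

Accuracy alone can look good on an imbalanced data set where most articles are real, which is why precision on the "fake" class is worth reporting next to it.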


  • Dr. Dakshinamurthy V Kolluru - ML and DL in Production: Differences and Similarities

    45 Mins
    Talk
    Beginner

    While architecting a data-based solution, one needs to approach the problem differently depending on the specific strategy being adopted. In traditional machine learning, the focus is mostly on feature engineering. In DL, the emphasis is shifting to tagging larger volumes of data, with less focus on feature development. Similarly, synthetic data is a lot more useful in DL than in ML. So, the data strategies can be significantly different. Both require very similar approaches to error analysis, but in most development processes those approaches are not followed, leading to substantial delays in reaching production. Hyperparameter tuning for performance improvement requires different strategies for ML and DL solutions because of the longer training times of DL systems. Transfer learning is a very important aspect to evaluate when building any state-of-the-art system, whether ML or DL. Last but not least is understanding the biases that the system is learning. Deeply non-linear models require special attention in this respect, as they can learn highly undesirable features.

    In our presentation, we will focus on all the above aspects with suitable examples and provide a framework for practitioners for building ML/DL applications.

  • Dr. Manish Gupta / Radhakrishnan G - Driving Intelligence from Credit Card Spend Data using Deep Learning

    45 Mins
    Talk
    Beginner

    Recently, we have heard success stories about how deep learning technologies are revolutionizing many industries. Deep learning has proven hugely successful on some problems in unstructured data domains such as image recognition, speech recognition and natural language processing. However, only limited gains have been shown in traditional structured data domains like BFSI. This talk covers American Express’ exciting journey of exploring deep learning techniques to generate the next set of data innovations by deriving intelligence from the data within its global, integrated network. Learn how using credit card spend data has helped improve credit and fraud decisions and elevate the payment experience of millions of Card Members across the globe.

  • Srijak Bhaumik - Let the Machine THINK for You

    20 Mins
    Demonstration
    Beginner

    Every organization is now focused on its business or customer data and is trying hard to get actionable insights out of it. Most are either hiring data scientists or up-skilling their existing developers. However, although they understand the domain or business, the relevant data and the impact, they are not necessarily proficient in data science programming or cognitive computing. To bridge this gap, IBM offers Watson Machine Learning (WML), a service for creating, deploying, scoring and managing machine learning models. WML’s model creation, deployment, and management capabilities are key components of cognitive applications. Its essential feature is its “self-learning” capabilities, personalized and customized for a specific persona, be it an executive or business leader, project manager, financial expert or sales advisor. WML makes cognitive prediction easy through its model flow capabilities, where machine learning and prediction can be applied with just a few clicks and without much coding, for different personas, marking the boundaries between developers, data scientists and business analysts. In this session, WML's capabilities will be demonstrated through a specific case study that solves a real-world business problem, along with the challenges faced. For the developer community, the architecture of this smart platform will be highlighted to help aspiring developers understand the design of a large-scale product.

  • Dr. Veena Mendiratta - Network Anomaly Detection and Root Cause Analysis

    45 Mins
    Talk
    Intermediate

    Modern telecommunication networks are complex, consist of several components, generate massive amounts of data in the form of logs (volume, velocity, variety), and are designed for high reliability as there is a customer expectation of always on network access. It can be difficult to detect network failures with typical KPIs as the problems may be subtle with mild symptoms (small degradation in performance). In this workshop on network anomaly detection we will present the application of multivariate unsupervised learning techniques for anomaly detection, and root cause analysis using finite state machines. Once anomalies are detected, the message patterns in the logs of the anomaly data are compared to those of the normal data to determine where the problems are occurring. Additionally, the error codes in the anomaly data are analyzed to better understand the underlying problems. The data preprocessing methodology and feature selection methods will also be presented to determine the minimum set of features that can provide information on the network state. The algorithms are developed and tested with data from a 4G network. The impact of applying such methods is the proactive detection and root cause analysis of network anomalies thereby improving network reliability and availability.

  • Hariraj K - Big Data and Open data: as tools for empowering people

    Hariraj K
    Co-Founder
    Practica.ly
    6 months ago
    20 Mins
    Talk
    Beginner

    With limited transparency, governments tend to become less accessible to the public. While data science remains a dominant force in almost every industry that touches day-to-day life, its possibilities in administration and governance are yet to be exploited. In this presentation, I address how emerging concepts such as open data and big data can be used to strengthen democracies and help governments serve the public better. We will explore the various ways big data and open data can be used to bridge income inequalities and implement proper resource and service allocation. We will also look at different initiatives taken by individuals and communities and see the impact those initiatives have had on aiding governance. We will also emphasize the concept of open governance and government open data.

  • Hariraj K - Importing and cleaning data with R

    Hariraj K
    Co-Founder
    Practica.ly
    6 months ago
    45 Mins
    Workshop
    Intermediate

    We are experiencing a tremendous explosion in big data. A significant share of this data is unfit for direct analysis or machine learning. This presentation focuses on web scraping with powerful R packages such as httr and tools like XPath. The session will also introduce the principles of data cleaning. By the end of the session, you will be able to import raw data from most websites and transform it into proper, robust datasets. In the course of the session, we will apply the above concepts to build a robust dataset that is ready for analysis.

  • Venkatraman J - Hands on Data Science. Get hands dirty with real code!!!

    Venkatraman J
    Sr. Software engineer
    Metapack
    5 months ago
    45 Mins
    Workshop
    Intermediate

    Data science refers to the science of extracting useful information from data. Knowledge discovery in databases, data mining and information extraction are closely related to data science. Supervised, semi-supervised and unsupervised learning methodologies have moved out of academia and penetrated deep into industry, leading to actionable insights, dashboard-driven development, data-driven reasoning and so on. Data science has been the buzzword in industry for the last few years, with only a handful of data scientists around the world. The industry will need more and more data scientists in future to solve problems using statistical techniques. The exponential availability of unstructured data from the web has posed huge challenges for data scientists, who must exploit it before drawing conclusions.

    Now that's an overload of information and buzzwords. It all has to start somewhere. Where and how to start? How do you get your hands dirty rather than just reading books and blogs? Is it really science or just code? Let's get into code to talk data science.

    In this workshop I will show the tools required to do real data science, rather than just reading about it, by building real models using deep neural networks and showing a live demo of them. I will also share some of the key data science techniques every aspiring data scientist should have to thrive in the industry.

  • Hariraj K - Recommendation engine: Theory and mathematical implementation

    Hariraj K
    Co-Founder
    Practica.ly
    5 months ago
    10 Mins
    Talk
    Beginner

    From our Tinder matches to the movies we watch on Netflix, we encounter recommendation engines on a day-to-day basis, and with the data explosion under way, the number of recommendation engines at play will increase dramatically. In this talk, we look into the underlying principles of recommendation engines. You will learn about the main types of recommendation engine approaches. By the end of this session, you will have ideas on how each of these approaches can be implemented. You will also be able to understand the pros and cons of both these approaches.
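The abstract does not name the approaches it covers. As one commonly used example, the sketch below assumes user-based collaborative filtering with cosine similarity: items are scored for a user by the similarity-weighted ratings of other users. The function names and the sample ratings are hypothetical and serve only as illustration.

```python
import math
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b)


def recommend(target: str, ratings: Ratings, top_n: int = 3) -> List[str]:
    """Rank items the target user has not rated by predicted rating."""
    weighted_sum: Dict[str, float] = {}
    similarity_sum: Dict[str, float] = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        similarity = cosine_similarity(ratings[target], other_ratings)
        if similarity <= 0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item in ratings[target]:
                continue  # only recommend unseen items
            weighted_sum[item] = weighted_sum.get(item, 0.0) + similarity * rating
            similarity_sum[item] = similarity_sum.get(item, 0.0) + similarity
    predicted = {item: weighted_sum[item] / similarity_sum[item]
                 for item in weighted_sum}
    # Highest predicted rating first; ties broken alphabetically.
    return sorted(predicted, key=lambda item: (-predicted[item], item))[:top_n]


# Made-up ratings, purely to show the call pattern.
SAMPLE_RATINGS: Ratings = {
    "ann": {"A": 5, "B": 4},
    "bob": {"A": 5, "B": 4, "C": 5},
    "cat": {"A": 1, "C": 1, "D": 5},
}
```

Calling `recommend("ann", SAMPLE_RATINGS)` ranks the items ann has not yet rated. A content-based recommender would instead compare item attributes with a profile of the user's past choices, trading the need for many overlapping ratings against the need for good item descriptions.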