Lead AI/ML Engineer
Member since 2 years
Aditya currently works as Lead AI/ML Engineer at West Pharmaceuticals and previously worked at Microsoft as a Cloud Platform Developer. He has experience in Machine Learning, Deep Learning, Internet of Things (IoT), Robotics and Cloud Computing. He is currently working on applications of Computer Vision in manufacturing and quality inspection.
Along with Computer Vision, Aditya has experience in time-series analysis, NLP and speech analysis. He is enthusiastic about teaching, mentoring and active community participation. He is also an AI Researcher at a non-profit organization called MUST Research.
Aditya is a long-time member of MUST Research Club. He has attended and delivered talks on AI at many conferences, the prominent ones being ODSC India 2019, Indo Data Week 2019 and the NIT Silchar ML Hackathon. Aditya is also one of the faculty members at MUST Research Academy: https://academy.must.co.in/
Phone: +91 9962037759
Application of Masked RCNN for segmentation of brain haemorrhage from Computed Tomography Images
Automated analysis of CT scan images using AI to diagnose abnormalities can help overcome the cost, delay and error-proneness of manual analysis. Deep Learning has proved quite effective at mimicking human cognitive abilities (and even exceeding them in many cases), especially with unstructured data.
DL algorithms can detect, localize and quantify a growing list of brain pathologies, including intra-cerebral bleeds and their subtypes, infarcts, mass effect, midline shift and cranial fractures. With advanced DL algorithms, analysis of radiographic data becomes much more tractable, and this can accelerate early detection of certain critical medical conditions.
As mentioned, Deep Learning algorithms for computer vision have been extremely successful on classification and localization problems. With the availability of annotated datasets, segmenting an object or region of interest using Deep Learning has become feasible.
Algorithms like the Region-based Convolutional Neural Network (RCNN) and its evolved forms, Faster RCNN and Masked RCNN (Mask R-CNN), are widely used in advanced radiology to automatically detect medical conditions in radiographic images.
For this session, I am going to talk specifically about the application of Masked RCNN to detecting regions of brain haemorrhage in CT scan images of the brain.
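To make "localize and quantify" concrete, here is a minimal sketch of what can be done with a segmentation output once a model has produced it. It assumes the model returns a soft per-pixel mask with values in [0, 1] and that a threshold of 0.5 is acceptable; the toy mask, the threshold and the function names `binarize` and `region_stats` are illustrative assumptions, not part of the talk.

```python
from typing import List, Optional, Tuple

Mask = List[List[float]]  # per-pixel probabilities in [0, 1] (assumed model output)
Box = Tuple[int, int, int, int]  # (row_min, col_min, row_max, col_max)


def binarize(mask: Mask, threshold: float = 0.5) -> List[List[int]]:
    """Turn a soft mask into a 0/1 mask using an assumed threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in mask]


def region_stats(binary: List[List[int]]) -> Tuple[int, Optional[Box]]:
    """Return the region's pixel area and its bounding box (None if empty)."""
    coords = [(r, c) for r, row in enumerate(binary) for c, v in enumerate(row) if v]
    if not coords:
        return 0, None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return len(coords), (min(rows), min(cols), max(rows), max(cols))


# Toy 4x4 "scan" with a made-up high-probability region in the centre.
soft_mask = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.6, 0.2],
    [0.0, 0.1, 0.3, 0.1],
]
area, bbox = region_stats(binarize(soft_mask))
print(area, bbox)
```

In a real setting the pixel area would still need to be converted to physical units using the scan's pixel spacing, which this sketch leaves out.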
Enterprise DL - Accelerating Deep Learning Solutions to Production
Advanced deep learning approaches have been quite successful at solving real-world problems involving unstructured data like images, audio signals and text. But how often are we able to make the best use of these solutions by exercising their benefits at scale? Even though deep neural network based solutions have made a significant impact in the world of AI and data science, the key challenge most organizations face is bringing those solutions to production. Unless these remarkable yet complex algorithms are operated at scale in production, and unless they can be integrated with various software applications, they will never be truly impactful.
So, for this talk, I will discuss various approaches to moving deep learning solutions from notebooks or research environments to production, and how these solutions can be transformed into an enterprise-level, end-to-end Deep Learning solution that can be consumed as a service by any software application, with a practical use-case example.
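As a rough illustration of "consumed as a service", the sketch below wraps a model behind a JSON-in, JSON-out function that any web framework could call. Everything here is an assumption made for illustration: `dummy_model` stands in for a trained network, and the `pixels` request field and response shape are hypothetical, not the approach presented in the talk.

```python
import json
from typing import Callable, Dict, List

Model = Callable[[List[float]], Dict[str, float]]


def dummy_model(pixels: List[float]) -> Dict[str, float]:
    """Stand-in for a trained network: scores an input by its mean intensity."""
    score = sum(pixels) / len(pixels) if pixels else 0.0
    return {"score": score}


def make_handler(model: Model) -> Callable[[str], str]:
    """Wrap a model so callers only ever exchange JSON strings with it."""

    def handle(request_body: str) -> str:
        try:
            payload = json.loads(request_body)
            pixels = [float(x) for x in payload["pixels"]]
        except (ValueError, KeyError, TypeError):
            # Malformed input is reported to the caller instead of crashing the service.
            return json.dumps({"status": "error", "message": "expected {\"pixels\": [numbers]}"})
        return json.dumps({"status": "ok", "prediction": model(pixels)})

    return handle


handle = make_handler(dummy_model)
response = json.loads(handle(json.dumps({"pixels": [0.2, 0.4, 0.6]})))
print(response)
```

Keeping the model behind a plain function like this means the notebook code and the serving layer can evolve separately; the serving framework, batching and monitoring are deliberately out of scope for the sketch.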
Using Deep Learning to identify medical conditions related to Thorax Region from Radiographic X-Ray Images
Automated analysis of chest X-ray images to diagnose various pathologies, especially using deep learning based approaches, can help overcome the cost, delay and error-proneness of manual analysis. One recent effort in this direction is Classification of Common Thorax, which combines the advantages of CNN-based feature extraction and problem transformation methods in a multi-label classification task.
So this is one of the key areas where deep learning based solutions have already made an impact and have the potential to deliver even better performance.
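In a multi-label task, one image can carry several findings at once, so each label gets its own yes/no decision. The sketch below uses binary relevance, one common problem transformation method, chosen here only for illustration since the abstract does not say which method was used. The logits, the label names and the 0.5 threshold are made-up example values.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Binary relevance: decide each label independently of the others."""
    return sorted(name for name, z in logits.items() if sigmoid(z) >= threshold)


# Hypothetical per-label scores, as a CNN's final layer might produce for one X-ray.
example_logits = {
    "Atelectasis": 1.2,
    "Cardiomegaly": -0.7,
    "Effusion": 0.3,
    "Pneumothorax": -2.0,
}
labels = predict_labels(example_logits)
print(labels)
```

Unlike a softmax classifier, nothing forces the probabilities to sum to one, so zero, one or several labels can be returned for the same image.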
For this session, I am going to discuss the problem at hand, the dataset, and several approaches that have been explored and have worked quite well so far in this research. I will also cover the potential use case and the real-world impact of such a healthcare application, which could save millions of lives through early and effective detection.
I will also cover some of the key challenges faced during this research and how it can be scaled to build an end-to-end software solution!
Workshop: Introduction to Image Generation in Computer Vision using Deep Learning
ODSC India 2019 | Machine Learning & Deep Learning | Workshop | 45 Mins | Intermediate
Impact of Data Science and Artificial Intelligence on Societal Growth
Machine generated animations for improving cognitive abilities of "special children"
ODSC India 2019 | Machine Learning & Deep Learning | Talk | 45 Mins | Advanced
"Special children" includes children affected by complex neuro-behavioral conditions like autism, which involves impairments in social interaction, language development and communication skills, combined with rigid, repetitive behaviors. Children with autism in particular face a very difficult childhood because they have extreme difficulty communicating. They have trouble understanding what other people think and feel. This makes it very hard for them to express themselves either with words or through gestures.
Such special children need “special” care for the development of their cognitive abilities. The learning resources required for teaching such children are extremely hard to find and inaccessible to many.
So, can artificial intelligence with the help of modern deep learning algorithms generate animated videos for developing or improving cognitive abilities of such a special group?
The idea to combat the problem:
Well, I feel it can be done!
An animated video consists of 3 main components:
1. Graphical video (sequence of images put together to tell a story),
2. A background story and
3. A relevant background audio or music.
Now if we have to come up with a system that produces machine generated animated video, we would have to think about these three components:
- Machine generated sequence of images with a spatial coherence
- Machine generated text, or the story
- Machine generated audio or music, that highlights the mood or the theme of the video
If these three discrete components are put together in a cohesive flow, our purpose can be achieved. The Deep Learning community has already made significant progress on machine generated images, audio and text.
Details about the three pillars of this problem:
Machine generated sequence of images with a spatial coherence
Generative Adversarial Networks (GANs) have been quite successful to date at generating images and audio. For our use case, where coherence in spatial features must be maintained, Variational Auto Encoders (VAEs) appear to be an even better fit.
Let’s start with a very popular cartoon series, Tom & Jerry, specially modified for autistic children, and consider a simple scene where Tom is chasing Jerry. At the image level, the posture of Tom and Jerry remains constant for the entire scene; only their location varies in each subsequent frame. In other words, only their spatial location with respect to the image background varies, so VAEs have the potential to implement such a use case, because VAEs provide probabilistic descriptions of features or observations in latent spaces.
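The chase-scene idea can be pictured as walking through latent space: posture stays fixed while the part of the latent code that controls position changes smoothly. The sketch below is a toy under strong assumptions. The 2-D latent code, the idea that its first coordinate controls horizontal position, and `toy_decoder` are all hypothetical stand-ins; a real VAE decoder is a trained network, and its latent dimensions are not guaranteed to separate position this cleanly.

```python
from typing import List, Tuple

Latent = Tuple[float, float]  # (position, posture), an assumed toy latent layout


def interpolate(start: Latent, end: Latent, steps: int) -> List[Latent]:
    """Evenly spaced latent codes from start to end, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    codes = []
    for i in range(steps):
        t = i / (steps - 1)
        codes.append((start[0] + (end[0] - start[0]) * t,
                      start[1] + (end[1] - start[1]) * t))
    return codes


def toy_decoder(z: Latent, width: int = 10) -> str:
    """Stand-in for a VAE decoder: draws a one-row frame with 'T' placed by z[0]."""
    pos = max(0, min(width - 1, round(z[0] * (width - 1))))
    return "." * pos + "T" + "." * (width - 1 - pos)


# Posture coordinate held at 0.5; only the position coordinate moves from 0 to 1.
frames = [toy_decoder(z) for z in interpolate((0.0, 0.5), (1.0, 0.5), steps=4)]
for frame in frames:
    print(frame)
```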
Machine generated text, or the story
Coming to text generation or story generation, recurrent neural networks like Long Short-Term Memory (LSTM) have been quite successful. LSTMs have already been used to artificially generate chapters from popular novels or stories like Harry Potter and Cinderella. So, for a simple animated video story specially structured for autistic children, an LSTM can be effective. Gated Recurrent Units (GRU) are an alternative, but to date LSTM has been more successful, so LSTM will be the first preference.
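For readers unfamiliar with what an LSTM does at each step, here is a minimal sketch of a single LSTM cell update using scalar inputs and states, so the gate arithmetic is easy to follow. The weights are arbitrary example values, not trained ones, and a real text generator would use vectors, learned weights and an output layer over the vocabulary.

```python
import math
from typing import Dict, Tuple

# For each gate: (input weight, recurrent weight, bias). Values here are arbitrary.
Weights = Dict[str, Tuple[float, float, float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h_prev: float, c_prev: float, w: Weights) -> Tuple[float, float]:
    """One time step of a scalar LSTM cell; returns the new (hidden, cell) state."""

    def pre(gate: str) -> float:
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the new candidate to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value to write
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


weights = {gate: (0.5, 0.5, 0.0) for gate in ("forget", "input", "output", "candidate")}
h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.5]:  # a made-up input sequence
    h, c = lstm_step(x, h, c, weights)
print(h, c)
```

The cell state `c` is what lets the network carry information across many steps, which is why LSTMs suit story-length text better than a plain recurrent unit.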
Machine generated audio or music
For music generation, GANs have proved effective to date. For our use case, Natural Language Processing (NLP) can be used to determine the type of scene from the generated story; e.g. the Tom & Jerry scene is a chase scene. Based on this classification, Deep Convolutional Generative Adversarial Networks (DCGAN) can be used to generate music that is relevant to such a chase scene and at the same time soothing and enjoyable for such children!
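The "classify the scene, then condition the music on it" step could look roughly like the sketch below. A keyword count is used as a deliberately crude stand-in for a real NLP classifier, and the scene names, keyword lists and music settings are all invented for illustration; the music generator itself is not shown.

```python
from typing import Dict, Tuple

# Hypothetical scene vocabulary; a trained text classifier would replace this.
SCENE_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "chase": ("chase", "chases", "chasing", "run", "running"),
    "rest": ("sleep", "sleeping", "nap", "rest"),
}

# Hypothetical conditioning values that a music generator could receive.
MUSIC_FOR_SCENE: Dict[str, Dict[str, object]] = {
    "chase": {"tempo_bpm": 120, "mood": "playful"},
    "rest": {"tempo_bpm": 60, "mood": "calm"},
    "neutral": {"tempo_bpm": 90, "mood": "gentle"},
}


def classify_scene(story: str) -> str:
    """Pick the scene type whose keywords occur most often, else 'neutral'."""
    words = [w.strip(".,!?").lower() for w in story.split()]
    counts = {scene: sum(words.count(k) for k in kws)
              for scene, kws in SCENE_KEYWORDS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "neutral"


scene = classify_scene("Tom is chasing Jerry around the kitchen.")
print(scene, MUSIC_FOR_SCENE[scene])
```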
Assembling everything together
Now if we can put all these discrete pieces of the puzzle together, we can come up with a completely machine generated animated video tailor-made for developing and improving the cognitive abilities of children with autism. This would be a new step forward in the field of Artificial Intelligence!
The neural networks that generate these videos can be trained in such a way that the videos are a source of fun and enjoyment for this special group and at the same time reward good behavior and educate them in a sensitive way, without any human dependency.
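One way to picture the assembly is as three pluggable generators feeding a single container. In the sketch below the generators are trivial placeholders passed in as functions, and the choice to generate the story first and condition the frames and audio on it is an assumption; `AnimatedVideo` and `assemble` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AnimatedVideo:
    story: str
    frames: List[str]
    audio: str


def assemble(
    generate_story: Callable[[], str],
    generate_frames: Callable[[str], List[str]],
    generate_audio: Callable[[str], str],
) -> AnimatedVideo:
    """Run the three generators in sequence and bundle their outputs."""
    story = generate_story()
    return AnimatedVideo(
        story=story,
        frames=generate_frames(story),
        audio=generate_audio(story),
    )


# Placeholder generators; real ones would be the text, image and music models.
video = assemble(
    generate_story=lambda: "Tom chases Jerry.",
    generate_frames=lambda story: [f"frame {i}: {story}" for i in range(3)],
    generate_audio=lambda story: "playful chase theme",
)
print(video)
```

Passing the generators in as functions keeps each model replaceable, so a better image or music model can be swapped in without touching the assembly step.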
Future scope and extension
As a future extension, if this approach is successful, the gaming industry could adopt such technology and, with the help of reinforcement learning, come up with machine generated video games and educational games specially designed for such children. This could disrupt the entire gaming industry and be a source of happiness for such children!
Person Identification via Multi-Modal Interface with Combination of Speech and Image Data
Joy Mustafi, Founder and President, MUST Research
Aditya Bhattacharya, Lead AI/ML Engineer, West Pharmaceuticals
Having multiple modalities in a system gives more affordance to users and can contribute to a more robust system. Having more modalities also allows greater accessibility for users who work more effectively with certain modalities. Multiple modalities can be used as backup when certain forms of communication are not possible. This is especially true of redundant modalities, in which two or more modalities are used to communicate the same information. Certain combinations of modalities can add to the expressiveness of a computer-human or human-computer interaction, because each modality may be more effective at expressing one form or aspect of information than others. For example, MUST researchers are working on a personalized humanoid equipped with various types of input devices and sensors that allow it to receive information from humans. These inputs are interchangeable and offer a standardized method of communication with the computer, affording practical adjustments to the user, providing a richer interaction depending on the context, and supporting a robust system with features such as keyboard, pointing device, touchscreen, computer vision, speech recognition, motion and orientation.
There are six types of cooperation between modalities, and they help define how a combination or fusion of modalities works together to convey information more effectively.
- Equivalence: information is presented in multiple ways and can be interpreted as the same information
- Specialization: when a specific kind of information is always processed through the same modality
- Redundancy: multiple modalities process the same information
- Complementarity: multiple modalities take separate information and merge it
- Transfer: a modality produces information that another modality consumes
- Concurrency: multiple modalities take in separate information that is not merged
Computer - Human Modalities
Computers utilize a wide range of technologies to communicate and send information to humans:
- Vision - computer graphics typically through a screen
- Audition - various audio outputs
Adaptive: They MUST learn as information changes, and as goals and requirements evolve. They MUST resolve ambiguity and tolerate unpredictability. They MUST be engineered to feed on dynamic data in real time.
Interactive: They MUST interact easily with users so that those users can define their needs comfortably. They MUST interact with other processors, devices, services, as well as with people.
Iterative and Stateful: They MUST aid in defining a problem by asking questions or finding additional source input if a problem statement is ambiguous or incomplete. They MUST remember previous interactions in a process and return information that is suitable for the specific application at that point in time.
Contextual: They MUST understand, identify, and extract contextual elements such as meaning, syntax, time, location, appropriate domain, regulation, user profile, process, task and goal. They may draw on multiple sources of information, including both structured and unstructured digital information, as well as sensory inputs (visual, gestural, auditory, or sensor-provided).
Multi-Modal Interaction: https://www.youtube.com/watch?v=jQ8Gq2HWxiA
Gesture Detection: https://www.youtube.com/watch?v=rDSuCnC8Ei0
Speech Recognition: https://www.youtube.com/watch?v=AewM3TsjoBk
Assignment (Hands-on Challenge for Attendees)
Real-time multi-modal access control system for authorized access to work environment - All the key concepts and individual steps will be demonstrated and explained in this workshop, and the attendees need to customize the generic code or approach for this assignment or hands-on challenge.
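As a starting point for the assignment, the sketch below shows one simple way to combine two modalities at the score level. It assumes a face matcher and a voice matcher each return a similarity score in [0, 1], or `None` when that modality is unavailable. The equal weighting, the 0.8 threshold and the function names are illustrative assumptions, not the workshop's reference solution.

```python
from typing import Optional


def fuse_scores(face: Optional[float], voice: Optional[float],
                face_weight: float = 0.5) -> Optional[float]:
    """Merge both scores when present; fall back to the one that is available."""
    if face is None and voice is None:
        return None
    if face is None:
        return voice   # redundancy: voice acts as backup
    if voice is None:
        return face    # redundancy: face acts as backup
    # Complementarity: the two modalities are merged into a single score.
    return face_weight * face + (1.0 - face_weight) * voice


def grant_access(face: Optional[float], voice: Optional[float],
                 threshold: float = 0.8) -> bool:
    """Allow entry only when the fused score reaches the assumed threshold."""
    score = fuse_scores(face, voice)
    return score is not None and score >= threshold


print(grant_access(0.9, 0.85))   # both modalities agree
print(grant_access(0.9, 0.5))    # voice disagrees, fused score falls below threshold
print(grant_access(None, 0.95))  # camera unavailable, voice used as backup
```

Whether a single modality should ever be enough to grant access is a policy decision; a stricter system could require both scores and treat a missing one as a refusal.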