YOW! Data 2018 Day 1

Mon, May 14
08:00

    Registration for YOW! Data 2018 - 45 mins

08:45

    Session Overviews & Introductions - 15 mins

09:00

    Dean Wampler - Stream All the Things!!

    09:00 AM - 09:50 AM, Wesley Theatre, 208 interested

    Streaming data architectures aren't just "faster" Big Data architectures. They must be reliable and scalable as never before, more like microservice architectures.

    This talk has three goals:

    1. Justify the transition from batch-oriented big data to stream-oriented fast data.
    2. Explain the requirements that streaming architectures must meet and the tools and techniques used to meet them.
    3. Discuss the ways that fast data and microservice architectures are converging.

    Big data started with an emphasis on batch-oriented architectures, where data is captured in large, scalable stores, then processed using batch jobs. To reduce the gap between data arrival and information extraction, these architectures are now evolving to be stream oriented, where data is processed as it arrives. Fast data is the new buzzword.

    These architectures introduce new challenges for developers. Whereas a batch job might run for hours, a stream processing system typically runs for weeks or months, which raises the bar for making these systems reliable and scalable to handle any contingency.

    The microservice world has faced this challenge for a while. Microservices are inherently message driven, responding to requests for service and sending messages to other microservices, in turn. Hence, they are also stream oriented, in the sense that they must respond reliably to never-ending input. So, they offer guidance for how to build reliable streaming data systems. I'll discuss how these architectures are merging in other ways, too.

    We'll also discuss how to pick streaming technologies based on four axes of concern:

    • Low latency: What's my time budget for handling this data?
    • High volume: How much data per unit time must I handle?
    • Data processing: Do I need machine learning, SQL queries, conventional ETL processing, etc.?
    • Integration with other tools: Which ones and how is data exchanged between them?

    We'll consider specific examples of streaming tools and how they fit on these axes, including Spark, Flink, Akka Streams, and Kafka.

09:50

    Simon Carryer - Data is a Soft Science

    09:50 AM - 10:20 AM, Wesley Theatre, 202 interested

    The perception of data science, and often the way it is taught, is like this: You have some nice, tidy data, you use the latest, coolest algorithm, and you get some super clever results. You know it’s good ‘cause your r-squared value is through the roof, and you could play checkers on your confusion matrix.

    But the reality is different. That nice, tidy dataset has to be wrangled out of a big, nasty production system that was built by a coffee-fueled maniac. Those cool results have to somehow be translated into a user interface in which it’s the 12th most important thing on the page, and you have to fight for every pixel. And in front of that production system, entering that data, clicking on that user interface, is a data scientist’s worst nightmare: People.

    As much as we might want to believe that data science is a pure “hard” science, about writing Greek letters on chalkboards and stroking our chins, the truth is that what we do is more usefully thought of as a social science. Data science is a lens for understanding human behaviour. It is a tool for communicating with people. Data is a soft science.

    This talk is about how my background in Social Anthropology gave me a unique approach to doing data science. I’ll show how taking this view of data science led to some cool discoveries in some interesting projects. And I’ll talk about how, building accounting software at Xero, we’ve started on the journey towards building a “smarter” application. As we've done this, the hardest problems have not been about technical implementation; they’ve been about understanding the interface between these technologies and our users. Our data science problems at Xero, it turns out, are mostly about how to understand humans.

10:20

    Morning Break - 25 mins

10:45

    Cameron Joannidis - Machine Learning Systems for Engineers

    10:45 AM - 11:15 AM, Wesley Theatre, 212 interested

    Machine Learning is often discussed in the context of data science, but little attention is given to the complexities of engineering production-ready ML systems. This talk will explore some of the important challenges and provide advice on solutions to these problems.

11:15

    Elaina Hyde - What happens when Galactic Evolution and Data Science collide?

    11:15 AM - 11:45 AM, Wesley Theatre, 209 interested

    This talk will cover a short trip around our Milky Way Galaxy and a discussion of how data science can be used to detect faint and sparse objects such as the dwarf satellites and streams that helped form the galaxy we live in. The data science applications and algorithms used determine the accuracy with which we can detect these mysterious bodies, and with the advent of greater cloud computing capability, the sky is no longer the limit when it comes to programming or astronomy.

11:45

    Tash Keuneman - The 80/20 of UX research: What qualitative data to use and when

    11:45 AM - 12:15 PM, Wesley Theatre, 213 interested

    We'll talk about the one qualitative process that you should use 80% of the time. Learn how to translate your customer research into meaningful numbers that you can make product decisions on.

    How many customers do you have to talk to in order to find 85% of the usability problems? How do you avoid the most common pitfalls that result in bad data? We'll go through that as well as the best practices you can use to confidently calculate the extent of the problems you've found.

    You'll walk out of the conference room with UX skills that you can put into practice right away and a gift bag to boot.
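
    The abstract does not say which model sits behind the "85% of the usability problems" question. One widely quoted candidate, assumed here purely for illustration, is the cumulative discovery model 1 - (1 - p)^n, where p is the chance that a single participant encounters a given problem. The default p = 0.31 and both function names are assumptions, not figures or code from the talk.

```python
def share_of_problems_found(n_users: int, p_single: float = 0.31) -> float:
    """Expected share of usability problems seen by at least one of n_users.

    p_single is the assumed chance that one participant hits a given problem.
    The default 0.31 is an often-quoted average, not a number from the talk.
    """
    if not 0.0 < p_single <= 1.0:
        raise ValueError("p_single must be in (0, 1]")
    return 1.0 - (1.0 - p_single) ** n_users


def users_needed(target: float, p_single: float = 0.31) -> int:
    """Smallest number of users whose expected coverage reaches target."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 0
    while share_of_problems_found(n, p_single) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Print expected coverage for one to eight participants.
    for n in range(1, 9):
        print(n, round(share_of_problems_found(n), 3))
```

    Under the assumed p = 0.31, five participants give an expected coverage of roughly 84%, close to the 85% in the abstract. The result is sensitive to p, so a product whose problems are rarer needs more sessions.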

12:15

    Lunch Break - 50 mins

01:05

    Dana Bradford - How to Save a Life: Could Real-Time Sensor Data Have Saved Mrs Elle?

    01:05 PM - 01:35 PM, Wesley Theatre, 205 interested

    This is the story of Mrs Elle*, a participant in a smart home pilot study. The pilot study aimed to test the efficacy of sensors to capture in-home activity data including meal preparation, attention to hygiene and movement around the house. The in-home monitoring and response service associated with the sensors had not been implemented, and as such, data was not analyzed in real time. Sadly, Mrs Elle suffered a massive stroke one night, and was found some time after. She later died in hospital without regaining consciousness. This paper looks at the data leading up to Mrs Elle’s stroke, to see if there were any clues that a neurological insult was imminent. We were most interested to know, had we been monitoring in real time, could the sensors have told us how to save a life?

    *pseudonym

01:35

    Boris Savkovic - Machine learning applications for the autonomous/connected vehicle: perspectives, applications and methods

    01:35 PM - 02:05 PM, Wesley Theatre, 213 interested

    The application of streaming and real-time data science/analytics to connected and autonomous vehicles is gaining traction around the world. Intelematics is an Australian leader and innovator in the field of telematics/connected vehicles as well as in big data traffic analytics, with Intelematics services used by Australian and overseas giants such as Ford, Toyota and Google.

    The topic of the talk is the application of streaming and big data analytics to autonomous and connected vehicles. Applications covered will include the ability to predict vehicle failures, forecast traffic conditions, automate vehicle insurance claims with automated crash/incident detection, and deliver data into the vehicle (traffic signal states and forecasts).

    The talk will outline general trends and give some examples of concrete solutions that we have developed at Intelematics in this emerging field, both in Australia and overseas (US and EU). The focus will be on the application of data science and algorithms, and the key role that these have to play in the emerging field of connected and autonomous vehicles. The relevant data streams bring a number of technical and non-technical challenges that will be discussed: the complexity of dealing with geo-spatial and temporal data, safety and security, privacy, the streaming nature of data, the event-driven nature of data, the volume of data, and the complexity of the relationships/patterns to be modelled.

    The underlying algorithms and technologies will also be discussed in some detail.

    Link to company website:

    http://www.intelematics.com/

    Speaker bio:

    https://au.linkedin.com/in/borislav-savkovic-23969154

02:05

    Gareth Jones - Using Sentiment Analysis To Fill In The Gaps From User Surveys

    02:05 PM - 02:35 PM, Wesley Theatre, 211 interested

    We put a year's worth of online help chat logs from a major Australian Superannuation website through Google's Natural Language API, to see what insights we could gain from the users. This talk will discuss how the Natural Language API works, and the underlying machine learning concepts, and also give you some ideas on how to make use of the information based on examples from our work. We'll compare the sentiment values with those expressed in exit surveys and find out how useful an indicator it can be.

02:35

    Afternoon Break - 20 mins

02:55

    Wai Chee Yau / Jeffrey Theobald - Deep Learning, Production and You

    02:55 PM - 03:25 PM, Wesley Theatre, 223 interested

    Simply building a successful machine learning model is extremely challenging, and just as much effort is needed to turn that model into a customer-facing product. Drawing on their experience working on Zendesk’s article recommendation product, Wai Chee Yau and Jeffrey Theobald discuss design challenges and real-world problems you may encounter when building a machine learning product at scale.

    Wai Chee and Jeffrey cover the evolution of the machine learning system, from individual models per customer (using Hadoop to aggregate the training data) to a universal deep learning model for all customers using TensorFlow, and outline some challenges they faced while building the infrastructure to serve TensorFlow models. They also explore the complexities of seamlessly upgrading to a new version of the model and detail the architecture that handles the constantly changing collection of articles that feed into the recommendation engine.

    Topics include:

    • Infrastructure for continuously changing textual data
    • Deploying and serving TensorFlow models in production
    • Real-world production problems when dealing with a machine learning model
    • Data, customer feedback, and user experience

03:25

    Hercules Konstantopoulos - The catastrophic consequences of not being awesome at plots

    03:25 PM - 03:55 PM, Wesley Theatre, 213 interested

    “Hey boss, we made a totally rad deep learning algorithm! It trawls through the internet and literally tells the future!”

    “Yeah, cool. But do you have a plot I can show the board?”

    Data visualisation is the capstone of data science. As businesses collate massive and disparate data streams, and as algorithms become more complex, communicating results has become more important and more challenging than ever before. We need to start placing as much importance on accessible visualisation as we do on database architecture or algorithm design.

    In this talk I will present some core visualisation principles that I have developed over 15 years of experience visualising (literally) astronomical datasets, carbon emissions, behavioural analytics, even the odd basketball game. These can be employed by any data scientist to help their data tell the right story to the right people.

03:55

    Shujia Zhang - Graph Neural Networks: Algorithm and Applications

    03:55 PM - 04:25 PM, Wesley Theatre, 214 interested

    Artificial neural networks help us cluster and classify. Since "deep learning" became a buzzword, it has driven many advances in AI, such as self-driving cars, image classification and AlphaGo. There are many different deep learning architectures; the most popular are based on the well-known convolutional neural network, which is one type of feed-forward neural network. This talk will introduce another variant of deep neural network, the graph neural network, which can model data represented as generic graphs (a graph can have labelled nodes connected via weighted edges). The talk will cover:

    • the graph (graph of graphs - GoGs) representation: how we represent different data with graphs
    • architecture of graph neural networks (GNN): the architecture of deep graph neural networks and learning algorithm
    • applications of GoGs and GNNs: document classification, web spam detection, human action recognition in video

04:25

    Afternoon Break - 20 mins

04:45

    04:45 PM - 05:15 PM, Wesley Theatre, 164 interested

    Bees are dying – in recent years an unprecedented decline in honey bee colonies has been seen around the globe. The causes are still largely unknown. At CSIRO, the Global Initiative for Honey bee Health (GIHH) is an international collaboration of researchers, beekeepers, farmers, and industry set up to research the threats to bee health in order to better understand colony collapse and find solutions that will allow for sustainable crop pollination and food security. Integral to the research effort are RFID tags that are manually fitted to bees. The abundance of data being collected by the thousands of bee-attached sensors, as well as additional environmental sensors, poses several challenges regarding the interpretation and comprehension of the data, both computationally and from a user perspective. In this talk, I will discuss visual analytics techniques that we have been investigating to facilitate an effective path from data to insight. I will particularly focus on interactive and immersive user interfaces that allow a range of end users to effectively explore the complex sensor data.

05:15

    Tomasz Bednarz - Visual Analytics on Steroids: High Performance Visualisation, Simulations and AI

    05:15 PM - 05:45 PM, Wesley Theatre, 128 interested

    In the time that someone takes to read this abstract, another could solve a detective puzzle if only they had enough quantitative evidence on which to prove their suspicions. But also, one could use visualisation and computational tools like a microscope, to seek a new cure for cancer or predict hospitalisation prevention. In this presentation, we will demonstrate new visual analytics techniques that use various mixed reality approaches that link simulations with collaborative, complex and interactive data exploration, placing the human in the loop. Thanks to recent advances in graphics hardware and compute power (especially GPGPU and modern Big Data / HPC infrastructures), the opportunities are immense, especially in improving our understanding of complex models that represent the real or hybrid worlds. Use cases presented will be drawn from ongoing research at CSIRO and the Expanded Perception and Interaction Centre (EPICentre), using world-class GPU clusters and visualisation capabilities.

05:45

    Dr Eugene Dubossarsky - The Zen of Data Science

    05:45 PM - 06:30 PM, Wesley Theatre, 133 interested

    What makes data science such a different field? Why is it such a challenge to structure, manage and capture the value of data science?

    This presentation will focus on the key issues around the practice of discovery (“science”) and how it differs from the practice of building new things (“engineering”).

    Other questions addressed will include:

    How do organisations manage and leverage the value of data science? What are the key “unknown unknowns” that managers miss so often, with disastrous results?

    What are the cultural, procedural and managerial differences between “scientists” and engineers in a modern, data-driven workplace?

06:30

    Conference Drinks & Networking - 60 mins

YOW! Data 2018 Day 2

Tue, May 15
08:45

    Session Overviews & Introductions - 15 mins

09:00

    Andrea Burbank - Building a culture of experimentation at Pinterest

    09:00 AM - 09:50 AM, Wesley Theatre, 188 interested

    A successful experimentation program consists of much more than mere randomization and measurement. How do you help stakeholders understand the right things to measure, avoid common pitfalls, and learn to rely on A/B tests as the best way to measure a new system or feature? Building a culture of experimentation and the right tools to support it is just as important as the statistics behind the comparisons themselves - and potentially much trickier to get right.

09:50

    Tim Garnsey - Respecting privacy with synthetically generated "look-alike" data sets

    09:50 AM - 10:20 AM, Wesley Theatre, 188 interested

    Safely handling data that contains sensitive or private information about people is a multi-million dollar problem at many companies. It adds time to the data engineering process, can cost a lot in software licenses for specialised tools, and brings a range of reputational and legal risks.

    Recent advances in deep learning have prompted an interesting way to attack this problem. By fitting a certain class of model on a source data set that contains sensitive information, we can produce a generator that outputs a supply of synthetic "look-alike" data. This output data will preserve many of the statistical relationships between fields found in the source, and offer mathematical guarantees around the identifiability of individuals in the source data set.

    This talk will provide an overview of the approach and show how it can speed up data engineering effort and reduce risk.

10:20

    Morning Break - 25 mins

10:45

    Fiona Tweedie - On the quest for advanced analytics: governance and the Internet of Things

    10:45 AM - 11:15 AM, Wesley Theatre, 199 interested

    Data scientists dream of crystal clear data lakes and perfectly ordered warehouses with comprehensive dictionaries, consistent formats and never a null value or encoding error to mar their analysis. The reality, however, is that the bulk of time on most data projects is spent sourcing and munging data before the exploration and analysis can begin. Governance is often presented as the solution to all data woes but all too often generates more meetings than results.

    The University of Melbourne is home to 8000 staff and 48000 students across seven campuses. Both researchers and professional staff recognise that data is going to be key to understanding this complex community and supporting its members. Sensor data collected from around the campuses promises the opportunity to analyse everything from demands on public transport to the impact of weather on coffee consumption. With researchers spread across ten faculties, there is a danger that multiple projects will collect fragmented data and the real power that comes from joining multiple datasets will never be realised. Conversely, overly prescriptive policies will date quickly and hamper innovation. Is it possible to satisfy both the desire to move rapidly to take advantage of new opportunities and the need to maintain data quality?

    This case study will present some of the IoT projects currently being explored at the University and examine the governance efforts that are being trialled to ensure the adoption of standards and future interoperability of devices and data.

11:15

    Rohan Dhupelia - A Retrospective on Building and Running Atlassian’s Data Lake

    11:15 AM - 11:45 AM, Wesley Theatre, 206 interested

    Atlassian strives to be a data-driven company that builds collaborative software for teams. Two and a half years ago we launched our analytics platform (i.e. data lake) on AWS, which is used by over 1500 internal users each month to gain insights and enforce decisions.

    In this talk we will present a retrospective on our analytics platform covering our blessings (what went well) and our mistakes (what could have been better) as well as talking about what potential next steps we might take to further improve the platform.

    This talk will cover both the technical aspects (i.e. architectural choices) and the non-technical aspects (i.e. team organisation, our principles and mandate).

11:45

    11:45 AM - 12:15 PM, Wesley Theatre, 197 interested

    You must have heard a few times that AI has beaten humans in image recognition. Is that true? Have you seen it yourself? I am going to demonstrate Cyclops, an image recognition system we built to recognise car models far better than any human.

    From there, this talk will take you through our journey: how it all began, why we built the early version of Cyclops and what the outcome was. I will also cover how we used this technology to dramatically improve the consumer experience and build many consumer-facing products that we had thought were not possible before.

    I will then dive into the technical details, starting with how we built Cyclops 1.0 with TensorFlow and how we overcame the training complexity with transfer learning. However, transfer learning comes with a limitation around directional invariance; I will show what it is and how we overcame it with our novel solution.

    Next, I will show you that building a car recognition system as complex as Cyclops 2.0 requires a superior model and a modification of our existing transfer learning technique. I will also take you through the problems we faced with low coverage as we went deeper and how we solved them. I will then investigate how distributed training can speed up the training process to make this practical.

12:15

    Lunch Break - 50 mins

01:05

    Linda McIver - Kids can change the world with data

    01:05 PM - 01:50 PM, Wesley Theatre, 188 interested

    From researchers perpetrating unspeakable acts with their datasets, to students trying to communicate experimental results with pie charts that don’t sum to 100, we have all seen data horror stories. But in my classes kids have worked with their communities to create real change using data and computation. We can band together to give all kids the opportunity to change the world. This is the story of kids. Of data. Of Computation. And of change. It’s the story of the Australian Data Science Education Institute.

01:50

    Sachin Abeywardana - Trump Tweets (Fun with Deep Learning)

    01:50 PM - 02:20 PM, Wesley Theatre, 192 interested

    In this talk I will introduce LSTMs, which are how deep learning deals with time series data. The dataset we will be focusing on is Trump's tweets. Using this data we will build a tweet generator, training it to produce Trump-style tweets one character at a time.

    We will be using Keras to build the deep learning model. Google Colab will be used so that all attendees will be able to use a GPU.
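
    Generating text "one character at a time" usually starts from training pairs of a short character context and the character that follows it. The sketch below shows only that data preparation step in plain Python. The function name, the window length and the sample text are assumptions for illustration, and the Keras LSTM itself is left out because the abstract does not describe the speaker's model.

```python
def make_char_windows(text, window=5):
    """Build (vocab, samples) for next-character prediction.

    Each sample is (context_ids, next_id): the ids of `window` consecutive
    characters and the id of the character that comes right after them.
    """
    vocab = sorted(set(text))
    char_to_id = {ch: i for i, ch in enumerate(vocab)}
    samples = []
    for start in range(len(text) - window):
        context = text[start:start + window]
        target = text[start + window]
        samples.append(([char_to_id[c] for c in context], char_to_id[target]))
    return vocab, samples


if __name__ == "__main__":
    # Hypothetical stand-in text; a real run would use the tweet corpus.
    vocab, samples = make_char_windows("make data great again", window=4)
    print(len(vocab), "distinct characters,", len(samples), "training pairs")
```

    A sequence model would then be trained to predict next_id from context_ids, and generation repeats that prediction while sliding the window forward by one character.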

02:20

    Afternoon Break - 20 mins

02:40

    Grant Carlson / James Binh Tham - Protecting Australia’s red meat industry with interactive data visualisation

    02:40 PM - 03:10 PM, Wesley Theatre, 179 interested

    Anyone who has seen a computer simulation of the spread of a pandemic would appreciate the power of visualisation to manage the situation.

    In Australia there exists the National Livestock Identification System (NLIS), which is used to trace the movements of cattle in the event of a serious disease outbreak.

    Using Neo4j we will demonstrate a tracing exercise across a dataset of over 400 million movement events showing the power of interactive data visualisation.

03:10

    Aidan O'Brien - DevOps 2.0: Evidence-based evolution of serverless architecture through automatic evaluation of “infrastructure as code” deployments

    03:10 PM - 03:40 PM, Wesley Theatre, 178 interested

    The scientific approach teaches us to formulate hypotheses and test them experimentally in order to advance systematically. DevOps, and software architecture in particular, do not traditionally follow this approach. Here decisions like “scaling up to more machines or simply employing a batch queue” or “using Apache Spark or sticking to a job scheduler across multiple machines” are worked out theoretically rather than implemented and tested objectively. Furthermore, the paucity of knowledge in unestablished systems like serverless cloud architecture hampers the theoretical approach.

    We therefore partnered with James Lewis and Kief Morris to establish a fundamentally different approach for serverless architecture design that is based on scientific principles. For this, the serverless architecture stack needs to firstly be fully defined through code/text, e.g. AWS CloudFormation, so that it can easily and consistently be deployed. This “architecture as text”-base can then be modified and re-deployed to systematically test hypotheses, e.g. is an algorithm faster or a particular autoscaling group more efficient. The second key element to this novel way of evolving architecture is the automatic evaluation of any newly deployed architecture without manually recording runtime or defining interactions between services, e.g. Epsagon’s monitoring solution.

    Here we describe the two key aspects in detail and showcase the benefits by describing how we improved runtime by 80% for the bioinformatics software framework GT-Scan, which is used by Australia’s premier research organization to conduct medical research.

03:40

    Atif Rahman - Privacy Preserved Data Augmentation using Enterprise Data Fabric

    03:40 PM - 04:10 PM, Wesley Theatre, 174 interested

    Enterprises hold data that has potential value outside their own firewalls. We have been trying to figure out how to share such data at a level of detail with others in a secure, safe, legal and risk-mitigated manner that ensures a high level of privacy while adding tangible economic and social value. Enterprises are facing numerous roadblocks, failed projects, inadequate business cases, and issues of scale that need newer techniques, technology and approaches.

    In this talk, we will set up the groundwork for scalable data augmentation for organisations and visualise technical architectures and solutions around the emerging technologies of data fabrics, edge computing and a second coming of data virtualisation.

    A self-assessment toolkit will be shared for people interested in applying it to their organisations.

04:10

    Afternoon Break - 20 mins

04:30

    Gala Camacho - Using Social Media Data to Explore Place Activity During the 2018 Gold Coast Commonwealth Games

    04:30 PM - 05:00 PM, Wesley Theatre, 147 interested

    Why do people choose to live in one neighbourhood over another? Every day government makes decisions that can change the way a neighbourhood operates and feels. Understanding the impact that these decisions have is convoluted and hard to measure.

    In April, the Gold Coast held the 2018 Commonwealth Games. These events, usually advertised as urban renewal or regeneration projects, have a lasting impact on the neighbourhoods where they take place. Usually, there is a strong push to predict the impact of the games through economic assessments and surveys; however, once the games are happening, is that impact tracked? Sometimes, further economic assessments are produced years after the event which evaluate whether the impact was as expected. Can we do better?

    Social data can provide us with more timely evidence of whether the event is activating the economy as expected, or maybe highlight issues that can be resolved.

    Using social media data we will explore Facebook places data during and after the games, ultimately generating a dashboard to help us visualise change by giving us the ability to filter and aggregate the data. We will manage the data using Python’s Pandas, generate quick visualisations using Plotly, and finish off by spinning up a dashboard using Plotly’s Dash.

05:00

    Andrew Docherty - Know Your Neighbours: Machine Learning on Graphs

    05:00 PM - 05:30 PM, Wesley Theatre, 120 interested

    Machine learning has become ubiquitous in many applications. There are many accessible tools available to apply standard machine learning models to make predictions on data.

    Typically, machine learning problems aim to predict something about an entity using some data about each entity. However, in the real world, entities - people, places or things - are connected to each other and can have complex interactions with their neighbours. We can use this information to improve our predictions, and to gain more insight into the network structure and how entities can affect each other.

    This talk will introduce machine learning on graphs, give some examples where graph approaches give large improvements over standard machine learning techniques, and will demonstrate some tools that make graph machine learning more approachable.
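
    As a rough illustration of using neighbours to improve predictions, the sketch below adds one extra feature per entity: the mean of the same feature over its direct neighbours. This is a generic neighbourhood-aggregation idea rather than the speaker's method, and the function name and toy data are hypothetical.

```python
def add_neighbour_mean(features, edges):
    """Pair each node's own value with the mean value of its neighbours.

    features: dict mapping node -> float
    edges: iterable of undirected (u, v) pairs between known nodes
    Returns a dict mapping node -> (own_value, neighbour_mean); nodes with
    no neighbours get a neighbour_mean of 0.0.
    """
    neighbours = {node: set() for node in features}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    enriched = {}
    for node, value in features.items():
        nbrs = neighbours[node]
        nbr_mean = sum(features[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
        enriched[node] = (value, nbr_mean)
    return enriched


if __name__ == "__main__":
    # Hypothetical toy graph: a is linked to b and c, d is isolated.
    scores = {"a": 1.0, "b": 3.0, "c": 5.0, "d": 2.0}
    links = [("a", "b"), ("a", "c")]
    print(add_neighbour_mean(scores, links))
```

    The enriched pairs can be fed to any standard model, which lets a conventional classifier see a small amount of network context without changing the learning algorithm.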

05:30

    Katie Bell - Is the 370 the worst bus in Sydney?

    05:30 PM - 06:00 PM, Wesley Theatre, 108 interested

    In Switzerland, people will be surprised at a bus that's 2 min late. In Sydney, people will only consider it noteworthy if a bus is more than 20 min late, and this varies greatly between routes and providers. So, how do Sydney bus routes stack up? And if we're talking about privatisation, how do the private bus providers stack up against the state buses?

    To answer these questions we need data… lots of data. Hooray for open government data! Transport for NSW publishes real-time information on the location and lateness of all public transport. Unfortunately it's ephemeral – there is no public log of historical lateness for us to analyse. To gather the data I needed I had to fetch, log and aggregate ephemeral real-time data that was never intended to be used this way. There are random gaps and spontaneous route or timetable changes for special events, roadworks or holidays. Even with noisy data, the patterns start to emerge across months and we can start to answer some questions. The 370 bus route is one of the most complained about routes in Sydney; it even has its own Facebook group of ironic fans... but is it really the worst bus? Let's look at the data.
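
    The "fetch, log and aggregate" step can be pictured with a small sketch. It assumes, hypothetically, that each logged snapshot is a tuple of (route, trip_id, observed_at, delay_seconds) and that the last observation of a trip is the one worth keeping. The record layout, the 300-second lateness threshold and the function name are all illustrative assumptions, not the speaker's pipeline or the Transport for NSW feed format.

```python
from collections import defaultdict
from statistics import mean


def summarise_lateness(snapshots, late_after_s=300):
    """Aggregate polled real-time snapshots into per-route lateness stats.

    snapshots: iterable of (route, trip_id, observed_at, delay_seconds).
    A trip polled many times is counted once, using its latest observation.
    """
    latest = {}
    for route, trip_id, observed_at, delay_s in snapshots:
        key = (route, trip_id)
        if key not in latest or observed_at > latest[key][0]:
            latest[key] = (observed_at, delay_s)

    by_route = defaultdict(list)
    for (route, _trip_id), (_observed_at, delay_s) in latest.items():
        by_route[route].append(delay_s)

    return {
        route: {
            "trips": len(delays),
            "mean_delay_s": mean(delays),
            "share_late": sum(d > late_after_s for d in delays) / len(delays),
        }
        for route, delays in by_route.items()
    }


if __name__ == "__main__":
    # Hypothetical log entries, not real observations.
    log = [
        ("370", "t1", 1, 60),
        ("370", "t1", 2, 600),
        ("370", "t2", 1, 0),
        ("333", "t9", 1, 120),
    ]
    print(summarise_lateness(log))
```

    Keeping one record per trip stops a frequently polled trip from dominating its route's average, which matters when the log has random gaps.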

Workshop Day 1

Wed, May 16
08:00

    Noon van der Silk - Deep Learning Workshop

    08:00 AM - 04:00 PM, Room 1

    Venture into deep learning with this 2-day workshop that will take you from the mathematical and theoretical foundations to building models and neural networks in TensorFlow. You will apply as you learn, working on exercises throughout the workshop. To enhance learning, a second day is dedicated to applying your new skills in team project work.

    This hands-on workshop is ideal for both data science and programming professionals, who are interested in learning the basics of deep learning and embarking on their first project.

Workshop Day 2

Thu, May 17
08:00

    Noon van der Silk - Deep Learning Workshop...continued

    08:00 AM - 04:00 PM, Room 1

    Venture into deep learning with this 2-day workshop that will take you from the mathematical and theoretical foundations to building models and neural networks in TensorFlow. You will apply as you learn, working on exercises throughout the workshop. To enhance learning, a second day is dedicated to applying your new skills in team project work.

    This hands-on workshop is ideal for both data science and programming professionals, who are interested in learning the basics of deep learning and embarking on their first project.