Page 36 - GCN, August/September 2018

Machine learning is an application of artificial intelligence that enables systems to process data and learn on their own without being explicitly programmed to reach a specific solution.
Artificial intelligence is a broader branch of computer science that deals with simulating human intelligence in computers and giving them the ability to adapt to different situations.
Deep learning is a subset of machine learning that uses multilayered neural networks to tackle complex tasks, such as object detection or speech recognition.
Neural networks are computer systems that are loosely modeled on the human brain and connect thousands or even millions of simple processing nodes to learn from massive datasets.
easier for the state to identify people who need to provide clarification on their tax filings.
The model was used for the previous tax-filing season, and officials expect to use the education model this fall. It was trained with data on students who did not graduate from high school, including information on their school performance, demographics for the area where they live and other variables. The model then assigns a low, medium or high risk to current students, and school officials can use that information to conduct targeted intervention for at-risk students, Iyer said.
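As a rough illustration of the low, medium and high risk tiers described above, the sketch below buckets a model's predicted probability into three labels. The cutoff values and the names `risk_tier` and `students_needing_outreach` are hypothetical; the article does not say how the state's model sets its thresholds.

```python
from typing import Dict, List

# Hypothetical cutoffs, chosen only for illustration.
# The article does not state the thresholds the state uses.
LOW_CUTOFF = 0.33
HIGH_CUTOFF = 0.66


def risk_tier(probability: float) -> str:
    """Map a predicted probability (0.0 to 1.0) to a risk label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < LOW_CUTOFF:
        return "low"
    if probability < HIGH_CUTOFF:
        return "medium"
    return "high"


def students_needing_outreach(scores: Dict[str, float]) -> List[str]:
    """Return the IDs of students in the high-risk tier, sorted by ID."""
    return sorted(sid for sid, p in scores.items() if risk_tier(p) == "high")
```

In practice the cutoffs would likely be tuned to how many interventions school officials can realistically carry out, which is a policy decision rather than a modeling one.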
Self-sorting data
New York City’s Center for Innovation through Data Intelligence has explored using data to address economic and health issues. Its most recent study analyzed young adults who successfully transitioned out of homelessness and assigned them to one of six groups: frequent jail stays, consistent supportive housing, consistent subsidized housing, earlier homeless experience, later homeless experience and minimal service use.
“It helps us to predict who could fall into that group, but it also helps us to understand what the resources of each of these groups are,” said Maryanne Schretzman, the center’s executive director.
Sorting individuals into the groups required some serious data wrangling. The center created profiles using real data from 8,795 individuals, which required gathering and protecting sensitive data from multiple sources. They included the Department of Youth and Community Development, the Department of Homeless Services, the Administration for Children’s Services, jails and hospitals. The center used SAS Link King software to bring the datasets together.
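The general idea of bringing records from several agencies together can be sketched with a simple deterministic match on a normalized name and date of birth. This is not how SAS Link King works internally, which the article does not describe; the field names `name` and `dob` and the functions `link_key` and `link_records` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]


def link_key(record: Record) -> Tuple[str, str]:
    """Build a match key: letters of the name, lowercased, plus date of birth."""
    name = "".join(ch for ch in record["name"].lower() if ch.isalpha())
    return (name, record["dob"])


def link_records(
    *sources: Tuple[str, Iterable[Record]]
) -> Dict[Tuple[str, str], List[Tuple[str, Record]]]:
    """Group records from several named sources under a shared match key."""
    profiles = defaultdict(list)
    for source_name, records in sources:
        for record in records:
            profiles[link_key(record)].append((source_name, record))
    return dict(profiles)
```

Exact-key matching like this misses misspellings and changed names, which is one reason dedicated linkage tools are used for work of this kind.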
That sensitive data never left the city government’s intranet, and it was transferred from one location to the next using an encrypted file transfer
 year to deepen its understanding of the machine learning and AI landscape. From the resulting conversations with vendors, it became clear that the state wasn’t taking advantage of the technologies’ potential.
“There is a huge gap between what it can do and what it’s being used for today,” Iyer said.
But Illinois isn’t diving in headfirst without a plan. Iyer said there are two important considerations when it comes to machine learning: architecture (data engineering) and science (data science).
Accordingly, officials are streamlining data sharing and optimizing datasets for machine learning. They also plan to create a data-sharing platform to help facilitate machine learning projects. It would make datasets available through application programming interfaces, thanks to data-sharing agreements the state has established among agencies. Those agencies would have access to Python, R and other statistical packages via a centralized platform, and the machine learning models could also be made available via an API, Iyer said.
The platform’s details are far from finalized, though. For instance, officials still haven’t decided whether it will be cloud-based or on-premises. “It depends on the sensitivity of the data and where the platforms are for supporting requirements,” Iyer said.
The absence of a formal technology platform has not stopped the state from launching a couple of machine learning projects. The Illinois Department of Revenue is using machine learning to help predict fraud, and the State Board of Education has tapped the technology to better predict which students will struggle academically and potentially drop out.
Iyer said the model for tax fraud was trained to find patterns in historical data where fraud had been found. The model assigns a fraud probability to taxpayers and flags their tax returns, making it