We are your data science workshop.

GPT-3: What is It?

Recently you might have come across GPT-3 and some of the cutting-edge tasks this innovation in artificial intelligence can carry out. If you are on tech Twitter at all, there is no way you avoided the hype.

Some of the things people have demonstrated thus far with GPT-3 include describing the layout of a webpage in natural language and having the model write the actual front-end code, or using the model to generate a full article about how humans shouldn’t be worried about robots any time soon.

Like self-driving cars, or other promises of artificial intelligence, it’s easy to get caught up in the hype. So we wanted to share a bit more about GPT-3, what it is, and what it can do. And, perhaps, find a use case for your organization.

Background

GPT-3 is a project of OpenAI, a California-based AI research laboratory. Initial backing for OpenAI came from Elon Musk, Peter Thiel, and other Silicon Valley technologists. The mission of the organization is “to ensure that artificial general intelligence benefits all of humanity.”

Among their projects is the Generative Pre-trained Transformer, or GPT. As the name suggests, the latest release is the third iteration of the model, and it is a massive leap forward from its predecessors, or from any other lab's model for that matter. GPT-3 has 175 billion parameters. For context, this is about 10 times the number of parameters of the next biggest model, Microsoft’s Turing NLG.

GPT-3 is a deep learning model that leverages neural networks. The model was trained mostly on the Common Crawl dataset, which is essentially data pulled from websites on the internet on a monthly basis. In addition, it used data from book transcripts, and even Wikipedia.

What Does it Do?

GPT-3 is a model that focuses on natural language. In the simplest terms, you can pose a question to it and have it respond much as a human would.

The most impactful use cases for GPT-3 are probably still being dreamed up by startups all over the world. But some of the obvious ways companies can use GPT-3 are for chatbots, answering customer support questions, or even translating plain English into SQL queries or regular expressions.
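To make the English-to-SQL idea a little more concrete, here is a minimal sketch of how a "few-shot" prompt for such a task might be assembled. It only builds the text you would send to a language model; it does not call any API. The function name, the example questions, and the table and column names are all made up for illustration.

```python
# Hypothetical example pairs; the table and column names are invented.
EXAMPLES = [
    ("How many customers do we have?", "SELECT COUNT(*) FROM customers;"),
    ("List the names of customers in London.",
     "SELECT name FROM customers WHERE city = 'London';"),
]


def build_sql_prompt(question, examples=EXAMPLES):
    """Build a few-shot prompt asking a language model to translate English into SQL."""
    lines = ["Translate each question into a SQL query.", ""]
    for english, sql in examples:
        lines.append(f"Question: {english}")
        lines.append(f"SQL: {sql}")
        lines.append("")
    # The prompt ends with an open "SQL:" line for the model to complete.
    lines.append(f"Question: {question}")
    lines.append("SQL:")
    return "\n".join(lines)


prompt = build_sql_prompt("Which customers signed up in 2020?")
print(prompt)
```

The worked examples show the model the pattern to follow, and the trailing "SQL:" line invites it to continue with a query for the new question.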

GPT-3 can also write some code, including CSS and Python.

How To Get Access

OpenAI is providing API access to GPT-3 so anyone can build applications on top of it. You can join the waitlist to get access by filling out a form with the use case you have in mind.

Summary

There is certainly a lot of hype around GPT-3, and we’ve yet to see killer applications built on it. Like any new technology, it will follow the traditional hype cycle. But the technology is clearly a massive improvement on its predecessors and holds enormous potential.

Five Good Uses of Data Science in Products

There was a period, not all that long ago, when startups pitched themselves first as a machine learning or artificial intelligence company, using these technologies to solve complex problems and provide a unique user experience. Now, data science methodologies are so ubiquitous that for many new companies and products in specific sectors, to even think about not leveraging them would be heretical.

We all interact with data science daily in the products we use. Like any well-implemented product feature, it blends in seamlessly with the user experience. As a user, you don’t need to know what technology is running in the background of the products you use. You want them to solve your headaches, or provide you joy.

Here is our list of five good uses of machine learning and data science in products:

Ocado

The thing that always turned me off shopping for food online is that there is a flow to a supermarket or grocery store. You walk through the various aisles, and the food on the shelves speaks to you, catches your attention, makes you think of a recipe that you want to try. You may start with a list, but you always end up finding something new that you want to try out.

Ocado is one of the leading employers of data scientists and engineers (in fact our data scientists Jeremy and Johan hail from Ocado). AI and machine learning underpin all of Ocado, including factory layout, driver logistics, customer feedback analysis, responding to customer complaints, and the shopping experience. Ocado technology also helps users to navigate through their shopping more efficiently, having the right next product suggested to them to help them get their shopping done better and quicker. Or, more cynically, so you buy more.

Smart Compose in Gmail

I am a nervous emailer. I’ll often write something and go over it three or four times, changing tiny details, because it doesn’t sound right to me. That all changed when Smart Compose came around. Somehow, the machine predicting what I should say gave me more confidence to say it.

While that might not be the exact use case or problem to be solved when they started building the product, it does make it one of my favourite features of G Suite. I’d imagine for many power users, and people who live in their inbox, it presents a considerable amount of time savings.

When I first came across this feature, I thought the UI would be a bit awkward, as you have to hit tab to utilise the suggestion. However, in my experience, it fits in quite nicely with how I type. And now as I tap this blog post draft out in Google Docs I wonder when they will bring this to other parts of the G Suite.

The tech behind Smart Compose is pretty impressive. There are many challenges the Google team needed to overcome, including speed (it needs to suggest quicker than people can type after all), scale (providing the right predictions for a given user), and reducing bias in the suggestions.

It uses neural networks to take into account contexts, such as email subject and prior correspondence, and predict what the next phrase might be. They have an excellent blog writeup here on the technology.
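As a rough illustration of the idea of predicting what comes next, here is a toy sketch that counts which word most often follows another in past messages. It is a deliberately simple stand-in, not Google's approach: Smart Compose uses neural networks and much richer context, whereas this uses plain word-pair counts. The function names and training sentences are made up.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training text."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers


def suggest_next(followers, text):
    """Suggest the most frequent follower of the last word typed, or None."""
    words = text.lower().split()
    if not words or words[-1] not in followers:
        return None
    return followers[words[-1]].most_common(1)[0][0]


# Made-up sentences standing in for prior correspondence.
history = [
    "thanks for your email",
    "thanks for your help",
    "thanks for your email yesterday",
    "looking forward to hearing from you",
]
model = train_bigrams(history)
print(suggest_next(model, "Thanks for your"))  # prints "email"
```

Even this crude version hints at why context matters: the more of the conversation the model can see, the better its guess is likely to be.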

Face Grouping in Google Photos/Other Photo Services

This post might give me away as a Google product power user. I love the facial grouping of Google Photos. It makes finding the right picture of people, in a sea of the millions of photos we all have on our phones, super quick. I am always impressed by how well it groups people, particularly with my kids. The technology can connect their newborn photos with them as a toddler, even as I struggle to remember, “is that Frankie or Archie in this one?” It can also distinguish my cat from the many other cat photos I have on my phone (don’t ask).

This facial recognition technology is used across products and features within Google, and they allow developers to deploy the technology in their own products, for instance, with the Firebase ML Kit.
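One common way to think about face grouping is that each face is turned into a list of numbers (an "embedding"), and faces whose embeddings point in a similar direction are put in the same group. The sketch below illustrates that idea only; it is not how Google Photos or the Firebase ML Kit is implemented, and the tiny three-number embeddings, file names, and threshold are invented.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors by angle: close to 1 means nearly the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def group_faces(embeddings, threshold=0.9):
    """Greedily add each photo to the first group whose representative is similar enough."""
    groups = []  # each group is a list of photo names; the first entry is its representative
    for name, vector in embeddings.items():
        for group in groups:
            if cosine_similarity(vector, embeddings[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            # No existing group matched, so this photo starts a new one.
            groups.append([name])
    return groups


# Made-up embeddings; real ones have many more dimensions and come from a neural network.
photos = {
    "frankie_newborn.jpg": [0.90, 0.10, 0.20],
    "archie_newborn.jpg": [0.10, 0.95, 0.15],
    "frankie_toddler.jpg": [0.85, 0.20, 0.25],
    "cat.jpg": [0.10, 0.20, 0.95],
}
print(group_faces(photos))
```

With these numbers the two Frankie photos land in one group, while Archie and the cat each get their own.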

Spotify Song Recommendations

I recently switched from the Google Play streaming service to Spotify (see, I can use non-Google products). One of the reasons it took me so long to do so was the headache of having to build a whole new library of music in Spotify. I didn’t want to go through it all and follow my favourite artists. What really surprised me when I made the move was how quickly, and with how little data, Spotify came to understand my musical tastes fairly accurately and started suggesting artists and songs that I had frequently listened to on Google Play.

There are a few technologies and techniques Spotify uses to predict your musical tastes and create your tailored playlists. First is collaborative filtering, which makes recommendations to you based on crossover with other listeners with similar preferences. Spotify also uses natural language processing (NLP): it scours the internet and tags songs based on how frequently they are mentioned alongside other artists and songs. The third method is raw audio processing, which recommends similar songs based on features such as tempo, key and time signature (more on these methodologies here).
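To give a feel for collaborative filtering, here is a minimal sketch that recommends artists liked by listeners whose tastes overlap with yours. It is a simplified illustration and not Spotify's actual system; the listening data is invented and the similarity measure (Jaccard overlap of liked artists) is just one easy choice among many.

```python
def jaccard(a, b):
    """Overlap between two sets of liked artists, from 0 (none) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, likes, top_n=3):
    """Score artists the target has not liked yet by the similarity of the listeners who like them."""
    scores = {}
    for listener, artists in likes.items():
        if listener == target:
            continue
        similarity = jaccard(likes[target], artists)
        for artist in artists - likes[target]:
            scores[artist] = scores.get(artist, 0.0) + similarity
    # Highest score first; ties are broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [artist for artist, score in ranked[:top_n] if score > 0]


# Made-up listening data.
likes = {
    "me": {"Radiohead", "Portishead"},
    "listener_a": {"Radiohead", "Portishead", "Massive Attack"},
    "listener_b": {"Radiohead", "Tricky"},
    "listener_c": {"Taylor Swift", "Adele"},
}
print(recommend("me", likes))  # prints ['Massive Attack', 'Tricky']
```

Listener C shares nothing with "me", so their favourites score zero and are left out, which is the "crossover with similar listeners" idea in miniature.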

Wealthfront ‘roboadvisor’

Financial services is an area ripe for the application of machine learning and other data science techniques. The vast amounts of available data, along with the inefficiencies, fraud, waste and high fees, make it particularly exciting as a wave of financial technology startups turns the space on its head.

My favourite consumer application in this area thus far is Wealthfront. It automatically builds users a balanced portfolio of exchange-traded funds based on risk profile. It even rebalances your portfolio for you to maximise efficiency. They have also released new features to help with financial planning, such as helping set budgets for when you want to buy a house, start a family, make large purchases, even plan to take an extended holiday. It plugs in all financial accounts you have, your current portfolio and risk preferences, and market data to help you prepare.
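The rebalancing step can be sketched with some simple arithmetic: work out what each fund should be worth at its target weight, then buy or sell the difference. This is an illustrative sketch rather than Wealthfront's method; the fund names, amounts, and weights are made up, the weights are assumed to sum to 1, and fees and taxes are ignored.

```python
def rebalance(holdings, targets):
    """Return the amount to buy (+) or sell (-) of each fund to restore the target weights."""
    total = sum(holdings.values())
    return {
        fund: round(total * weight - holdings.get(fund, 0.0), 2)
        for fund, weight in targets.items()
    }


# Made-up portfolio that has drifted away from a 60/30/10 split.
holdings = {"stocks_etf": 7000.0, "bonds_etf": 2000.0, "property_etf": 1000.0}
targets = {"stocks_etf": 0.60, "bonds_etf": 0.30, "property_etf": 0.10}

trades = rebalance(holdings, targets)
print(trades)  # sell 1000 of stocks, buy 1000 of bonds, leave property alone
```

A change in risk profile would simply mean a different set of target weights fed into the same calculation.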

Wealthfront’s model allows more consumers to have access to financial planning, advice and portfolio management for significantly lower fees. Previously you would have to pay financial advisors to help you budget, and generally, they require clients to have a minimum net worth. To manage a balanced portfolio, you’d have to do it yourself, pay fees to whichever account manager you had, remember to rebalance your portfolio, and change it as your risk profile changed. Instead, automation, data and machine learning help you accomplish all this at a fraction of the cost.

A Data Science Workshop for BerryWorld

BerryWorld are a leading business in international berry breeding and marketing, and have been operating for over 20 years. As their business has grown, so too has their desire to utilise their data to maximise their business output and to streamline their processes, which is where we came in. They recently expanded their Data Science team and invited us to come in and run a workshop to bring additional expertise and to assess some of the main challenges and questions they were facing from a new perspective.

Before the data science workshop, their Data Science team provided us with a list of questions that they were struggling with or particularly interested in addressing, so the day started by going through this list. The questions were quite general by design, and a main priority for the team was to get unbiased answers that were not tailored to their specific tech stack or problems. This approach allowed them to compare our recommendations side by side with their own solutions and workflows, with the aim of expanding their toolkit with more practical and straightforward solutions. The topics we covered were broad in scope, ranging from recommended tooling to delivery and deployment of Data Science solutions.

The rest of the day was spent addressing specific business problems. This turned into a very successful brainstorming session with input from all sides. Effective Data Science solutions require extensive domain knowledge, which they have an ample supply of. Our role was to provide an outside perspective and creative thinking, so in collaboration with the team, we were able to come up with several promising approaches to their problems.

At the end of a very enjoyable and fruitful day we had covered a wide range of technical topics, and the experts at BerryWorld were equipped with fresh ideas and solutions to apply to their data. In their own words:

“We think your suggestions to our workflow and approach to solving problems in the business was insightful and will be a useful benchmark with the direction we will look to take in the future. We liked you did it in a quite informal way which helped to explain our and your ideas as well.” Sergio Astorga, Lead Data Scientist, BerryWorld
