Stability AI launches StableCode, an LLM for code generation

Stability AI is well known for its Stable Diffusion text-to-image generation model, but that’s not all the generative AI startup is interested in developing. Stability AI is now also getting into code generation.

Today Stability AI announced the first public release of StableCode, a new open large language model (LLM) designed to help users generate programming language code. StableCode is available at three different levels: a base model for general use cases, an instruction model, and a long-context-window model that can support up to 16,000 tokens.

The StableCode model draws on an initial set of programming language data from the open-source BigCode project, with additional filtering and fine-tuning from Stability AI. Initially, StableCode will support development in the Python, Go, Java, JavaScript, C, markdown and C++ programming languages.

“What we want to do with this type of model is to replicate what we did for Stable Diffusion, which helps everyone in the world become an artist,” Christian Laforte, head of research for Stability AI, told VentureBeat in an exclusive interview. “We want to do the same with the StableCode model: basically allow anyone with good ideas [and] maybe a problem to be able to write a program to solve that problem.”


StableCode: Built on BigCode and big ideas

Training any LLM depends on data, and for StableCode that data comes from the BigCode project. Using BigCode as the basis for a generative AI code tool is not a new idea. Hugging Face and ServiceNow released the open StarCoder LLM, which is fundamentally based on BigCode, in May.

Nathan Cooper, principal research scientist at Stability AI, explained to VentureBeat in an exclusive interview that the training for StableCode involved significant filtering and cleaning of the BigCode data.

“We love BigCode; they do a great job on data governance, model governance, and model training,” Cooper said. “We took the datasets and applied additional filters for quality, as well as to generate the large context windowed version of the model, and then we trained it on our set.”

Cooper said Stability AI also carried out a number of training steps beyond what is in the core BigCode model. These steps included successive training on specific programming languages.

“It takes a very similar approach [to what’s] done in the field of natural language, where you start with the pre-training of a generalist model and then fine-tune it on a specific set of tasks, or in this case, languages,” Cooper said.

StableCode’s longer token length is a game changer for code generation

Looking beyond the BigCode foundation, the long-context version of StableCode can provide significant benefits to users.

The long-context-window version of StableCode features a 16,000-token context window, which Stability AI claims is larger than that of any other model. Cooper explained that the longer context window enables the use of more specific and complex code generation prompts. It also means that a user can have StableCode look at a medium-sized codebase with multiple files to help understand and generate new code.

“You can use this longer context window to give the model more information about your codebase and what other functions are defined in other files,” Cooper said. “So when it proposes code, it can be more tailored to your codebase and needs.”
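As a rough illustration of what Cooper describes, the sketch below packs several source files into a single prompt while staying under a 16,000-token budget. It is not Stability AI's tooling: the function names, the `# file:` header convention, the reserved output margin and the four-characters-per-token estimate are all assumptions made for the example, and a real workflow would count tokens with the model's own tokenizer.

```python
# Hypothetical sketch: fit files from a codebase into a fixed context window.
CONTEXT_TOKENS = 16_000  # window size reported for the long-context model


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer (assumes about 4 characters per token)."""
    return max(1, len(text) // 4)


def build_prompt(files: dict, task: str, reserve_for_output: int = 1_000) -> str:
    """Concatenate files in order until the estimated budget runs out, then add the task."""
    budget = CONTEXT_TOKENS - reserve_for_output - estimate_tokens(task)
    parts = []
    for path, source in files.items():
        chunk = f"# file: {path}\n{source}\n"
        cost = estimate_tokens(chunk)
        if cost > budget:
            break  # stop at the first file that does not fit
        parts.append(chunk)
        budget -= cost
    parts.append(task)
    return "".join(parts)


if __name__ == "__main__":
    repo = {
        "utils.py": "def slugify(s):\n    return s.lower().replace(' ', '-')\n",
        "models.py": "class Post:\n    def __init__(self, title):\n        self.title = title\n",
    }
    print(build_prompt(repo, "# Write a function that returns the slug for a Post\n"))
```

The larger the window, the more of these file chunks fit before the loop has to stop, which is the practical benefit the quote points to.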

Threading in better code generation with rotary position embedding (RoPE)

StableCode, like all modern generative AI models, is based on a transformer neural network.

Instead of using the ALiBi (Attention with Linear Biases) approach to position outputs in a transformer model, which is the approach StarCoder uses for its open generative AI model for coding, StableCode uses an approach known as rotary position embedding (RoPE).

Cooper said the ALiBi approach in transformer models tends to weigh current tokens more than past tokens. In his view, this is not an ideal approach for code because, unlike natural language, code does not have a set narrative structure with a beginning, middle and end. Code functions can be defined at any point in an application flow.
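The recency weighting Cooper refers to can be seen in a small sketch of an ALiBi-style bias, following the general recipe in the ALiBi paper rather than any particular model's code. Each attention head adds a penalty to its scores that grows linearly with the distance between the query token and the key token. The helper names here are illustrative, and the slope formula shown is the one the paper gives for head counts that are powers of two.

```python
import math


def alibi_slopes(num_heads: int) -> list:
    """Per-head slopes 2^(-8/n), 2^(-16/n), ... (power-of-two head counts)."""
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(slope: float, seq_len: int) -> list:
    """bias[i][j] = -slope * (i - j) for past tokens j <= i; future tokens are masked."""
    return [
        [-slope * (i - j) if j <= i else -math.inf for j in range(seq_len)]
        for i in range(seq_len)
    ]


def softmax(row: list) -> list:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    slope = alibi_slopes(8)[0]
    bias = alibi_bias(slope, seq_len=4)
    # With identical raw scores, the bias alone decides the attention weights
    # of the last token: nearer tokens receive more weight than distant ones.
    print(softmax(bias[3]))
```

With all raw scores equal, the last token's attention rises steadily toward the most recent positions, which is the built-in preference for the present over the past.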

“I don’t think coding fits the idea of making the present more important than the past, so … we use RoPE, [which] doesn’t have that kind of bias where you weigh the present more than the past,” Cooper said.
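For comparison, here is a minimal sketch of rotary position embedding in its interleaved-pair form, assuming the commonly used base of 10,000. It is not StableCode's implementation. Each pair of vector components is rotated by an angle proportional to the token's position, so the query-key score depends only on the offset between the two positions, and no explicit distance penalty is added to it.

```python
import math


def rope(vec: list, pos: int, base: float = 10000.0) -> list:
    """Rotate each (even, odd) component pair of vec by an angle that scales with pos."""
    d = len(vec)
    assert d % 2 == 0, "RoPE needs an even number of dimensions"
    out = []
    for k in range(d // 2):
        theta = pos * base ** (-2.0 * k / d)  # lower frequency for later pairs
        x, y = vec[2 * k], vec[2 * k + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


def dot(a: list, b: list) -> float:
    """Plain dot product, standing in for one attention score."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    q = [0.3, -1.2, 0.8, 0.5]
    k = [1.0, 0.4, -0.7, 0.2]
    # Same offset of 2 between query and key, early versus late in the sequence.
    early = dot(rope(q, 5), rope(k, 3))
    late = dot(rope(q, 105), rope(k, 103))
    print(early, late)  # the two scores agree up to floating-point error
```

Because the rotation leaves vector lengths unchanged and the score is a function of relative position alone, a function defined far back in the context is not down-weighted by an additive bias simply for being far away.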

It’s still early days for StableCode, and the purpose of the first release is to see how developers will receive and use the model.

“We will engage and work with the community to see what great directions they come up with and explore the generative developer space,” Cooper said.
