Hi,

I’d like to explore the possibilities of training a LM to learn a specific programming language so he can be used as co-pilot in that context. Language is a niche language (http://pharo.org), and there is no existing model nowadays knowing it (also, I want to make some extra tweaks, once I have it).
Thing is… I have no idea where to start! :)

Any hint where can I learn the ropes?

Thanks!

  • abhibeckert@lemmy.world
    link
    fedilink
    English
    arrow-up
    4
    ·
    10 months ago

    there is no existing model nowadays knowing it

    Yeah there is - GPT-4 is very familiar with Pharo. And you’re not going to be able to train anything better than that yourself. OpenAI said it cost about a hundred million dollars in hardware and electricity to train the GPT-4 model. I assume you don’t have a budget like that?

    Other major models are probably pretty good at it too, but in my experience GPT-4 seems to be the best one especially in terms of code generation. So, I recommend starting with that one.

    Sign up for ChatGPT Plus, so you can use GPT-4 instead of GPT-3.5 (GPT-4 is a lot better), and just say "explain this pharo code: " then paste in a block of code.

    Here’s an example of a simple chat I had about Pharo with GPT-4. I started by asking it to explain some sample code I found from one of the examples on the Pharo website, then I asked if it could be improved (it suggested some good improvements), and then I asked it how to add a feature:

    https://chat.openai.com/share/fbe7325c-498f-44b0-ba48-c433a15b4a63

  • j4k3@lemmy.world
    link
    fedilink
    English
    arrow-up
    3
    ·
    10 months ago

    It’s going to seem so wrong, but… stable diffusion and porn is the easiest way to figure out the basics of modifying a model. It is easier to spot mistakes and the number of examples to try is enormous. This space is packed with examples running on basic low level hardware and you’ll find lots of easy to follow examples. The bar is much higher when it comes to training examples for text or code. Most examples are based on proprietary toolchains or are very basic getting started guides.

    • abhibeckert@lemmy.world
      link
      fedilink
      English
      arrow-up
      3
      arrow-down
      2
      ·
      edit-2
      10 months ago

      … stable diffusion is a diffusion model - OP wants a language model. They don’t work even remotely the same way.

      • rufus@discuss.tchncs.de
        link
        fedilink
        English
        arrow-up
        1
        ·
        edit-2
        10 months ago

        I’d agree. And the toolchains aren’t proprietary. But they are different and you can transfer only a small amount of knowledge from one to the other. The concept ‘training an AI model’ is the same. PyTorch, Linux and the graphics card are the same. So you can lean something. But it’s another kind of AI model and the software stack you need to use also differs.