• keepthepace@slrpnk.net · 4 months ago

    It is called finetuning. I haven’t tried it, but oobabooga’s text-generation-webui has a tab to do it, and I believe it is pretty straightforward.

    Fine-tune a base model on your dataset, and then you will need to format your prompt the way your AIM logs are organized, e.g. you will need to add “<ch00f>” at the end of your text completion task. It will complete it in the way it learnt it.

    If you don’t have the GPU for it, many companies offer fine-tuning as a service, like Mistral.
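
    Concretely, the data-prep and prompt-formatting part might look something like this minimal sketch, assuming the exported logs use a simple “nick: message” line format. The file paths, the regex, and the exact tags are placeholders, not something text-generation-webui or Mistral’s service requires:

    ```python
    # Rough sketch: turn exported AIM logs into plain-text training samples for a
    # completion-style fine-tune. The "nick: message" line format assumed here is
    # a guess; adjust the parsing to however your logs are actually laid out.
    import re
    from pathlib import Path

    LOG_LINE = re.compile(r"^(?P<nick>[^:]+): (?P<msg>.*)$")

    def logs_to_training_text(log_path: str, out_path: str) -> None:
        samples = []
        for raw in Path(log_path).read_text(encoding="utf-8", errors="ignore").splitlines():
            m = LOG_LINE.match(raw.strip())
            if m:
                # Keep a "<nick> message" shape so the model learns who says what.
                samples.append(f"<{m.group('nick')}> {m.group('msg')}")
        Path(out_path).write_text("\n".join(samples), encoding="utf-8")

    # At inference time, prompt with the same structure and end on your own tag,
    # so the model completes the next line "as you":
    #   prompt = "<friend> you around tonight?\n<ch00f>"
    ```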

  • PerogiBoi@lemmy.ca · 4 months ago

    Why would you want this??? Anything I wrote from 16 years ago is so beyond cringey. You must have been a stellar kid.

    • corsicanguppy@lemmy.ca · 4 months ago

      I have 26 years of saved outgoing email.

      Recently I needed to redo a fix I learned about in 1998 and implemented back then. I implemented it again to install a crappy software project that, judging from its composition, canNOT have been written before the post-Y2K firing of so many mentors.

      I only remembered it after 3 hours of searching, saving myself another few hours and surely a nervous breakdown. But after filtering AD on the client end, the project installed easily.

      That’s the best example, but the things I don’t discover I already answered on Stack Overflow, I discover I answered years ago in email.

  • will_a113@lemmy.ml · 4 months ago

    Putting aside why you’d want to do this, it’d be pretty easy, actually. You’d still use a big model like GPT-4 or Claude as your “base”, but you would do two things:

    • Give it a knowledge base using your conversations. You can manually vectorize them into a vector database like Pinecone and build yourself an agent using a toolchain like LangChain, or just use a service (OpenAI Agents lets you upload data from your browser); there’s a rough sketch of this step after the list
    • Have one of the big LLMs (with a large context size) ingest all of those conversations and build out a prompt that describes “you”
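
    Roughly, that first bullet could look something like this in Python; the model name, the chunking, and the plain numpy similarity search are just stand-ins for what Pinecone or a LangChain retriever would handle for you:

    ```python
    # Rough sketch of the "knowledge base" step: embed each old conversation chunk
    # and do a naive in-memory similarity search. A real setup would push these
    # vectors into something like Pinecone (or wrap this in a LangChain retriever).
    import numpy as np
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def embed(texts: list[str]) -> np.ndarray:
        resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return np.array([d.embedding for d in resp.data])

    # Your AIM logs, split into chunks small enough to embed individually.
    conversations = [
        "<ch00f> did you finish the physics homework ...",
        "<ch00f> lol that movie was terrible ...",
    ]
    doc_vectors = embed(conversations)

    def retrieve(query: str, k: int = 3) -> list[str]:
        q = embed([query])[0]
        sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
        return [conversations[i] for i in np.argsort(sims)[::-1][:k]]
    ```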

    You would then:

    • Feed that generated prompt (with your own edits, of course) back into either your custom LangChain agent or OpenAI Agent; see the sketch below
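
    And the last two steps might look roughly like this (the model name and prompt wording are guesses, and retrieve() is the helper from the sketch above):

    ```python
    # Rough sketch: have a large-context model draft a persona prompt from the old
    # logs, then reuse it (plus retrieved snippets) on every new message.
    from openai import OpenAI

    client = OpenAI()

    def build_persona_prompt(all_logs: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system",
                 "content": "Summarize this person's voice, phrasing, and interests "
                            "as a reusable system prompt written in the second person."},
                {"role": "user", "content": all_logs},
            ],
        )
        return resp.choices[0].message.content  # hand-edit this before using it

    def chat_as_me(persona_prompt: str, user_msg: str) -> str:
        context = "\n".join(retrieve(user_msg))  # retrieve() from the sketch above
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system",
                 "content": persona_prompt + "\n\nRelevant old conversations:\n" + context},
                {"role": "user", "content": user_msg},
            ],
        )
        return resp.choices[0].message.content
    ```
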
    • ch00f@lemmy.worldOP · 4 months ago

      Because I communicated with a lot of people over AIM? It’s actually more than just high school; it covers 2004 to around 2012. Also, it’s 64 MB zipped. The actual size is much larger.