Posts 20 · Comments 0 · Joined 2 yr. ago
math @lemmy.sdf.org

yang-mills mass gap

The Ocho @lemmy.sdf.org

pogo stick high jump

The Ocho @lemmy.sdf.org

jet pack racing!

The Ocho @lemmy.sdf.org

high-level air hockey strats

Speedrun @lemmy.sdf.org

Hydro Thunder (Arcade) The Far East 1:19.46

Speedrun @lemmy.sdf.org

Super Mario Odyssey Any% (2 Player) in 56:54

The Ocho @lemmy.sdf.org

Highlights - Cycle Ball - Gold Medal Final | 2021 UCI Indoor Cycling World Championships

The Ocho @lemmy.sdf.org

Underwater rugby

The Ocho @lemmy.sdf.org
Motorcycles @lemmy.sdf.org

Milwaukee Senior TT - Highlights | 2023 Isle of Man TT Races

math @lemmy.sdf.org

open source math textbooks

Some links are broken, but otherwise it's a good collection. Post your open source math textbooks here.

WorldNews @lemmy.sdf.org

After state board approves first taxpayer-funded Catholic school, Hindus seek same | KGOU

math @lemmy.sdf.org

some older machine learning books

math @lemmy.sdf.org

"Prompt Gisting:" Train two models such that given inputs "Translate French

<G1>

<G2>

" and "

<G1>

G2>The cat," then G1 and G2 represent the entire instruction.

cross-posted from: https://lemmy.sdf.org/post/36227

Abstract: "Prompting is now the primary way to utilize the multitask capabilities of language models (LMs), but prompts occupy valuable space in the input context window, and re-encoding the same prompt is computationally inefficient. Finetuning and distillation methods allow for specialization of LMs without prompting, but require retraining the model for each task. To avoid this trade-off entirely, we present gisting, which trains an LM to compress prompts into smaller sets of "gist" tokens which can be reused for compute efficiency. Gist models can be easily trained as part of instruction finetuning via a restricted attention mask that encourages prompt compression. On decoder (LLaMA-7B) and encoder-decoder (FLAN-T5-XXL) LMs, gisting enables up to 26x compression of prompts, resulting in up to 40% FLOPs reductions, 4.2% wall time speedups, storage savings, and minimal loss in output quality. "

math @lemmy.sdf.org

Taming AI Bots: Prevent LLMs from entering "bad" states by using continuous guidance from the LLM itself ("is this good? bad?").

math @lemmy.sdf.org

The TeXbook

math @lemmy.sdf.org

open source data visualization books

math @lemmy.sdf.org

The space of homogeneous probability measures on $\overline{\Gamma \backslash X}_{\max}^{S}$ is compact

math @lemmy.sdf.org

Automorphic number theory

math @lemmy.sdf.org

NVIDIA's everything 2 anything