Buy A Modem

Show HN: Real-Time AI Design Benchmark

Hey HN,We built a different kind of AI benchmark for UI generation.Instead of static leaderboards or curated screenshots, you can watch multiple models generate the same design live, side-by-side, and decide which output is actually better.Under the hood, we call AI models from Anthropic (Opus), OpenAI (GPT), Google (Gemini), and Moonshot AI (Kimi).Each model generates a real, editable project using Tailwind CSS (not screenshots or canvas exports). You can export it for Next.js, Laravel (Blade),

OSS Tool: Hard spending limits for AI agents

When building our agents and running multi-agent swarms, we ran into a problem: we couldn’t easily set separate budgets for each agent. So I built SpendGuard for our own use and figured we’d open-source it in case it helps anyone else.It lets you create “agents” and assign each one a strict hard-limit budget in cents, with optional auto top-ups. No hosted API key is required, everything runs locally (except for the pricing list with recent models fetched from our server). The quickstart takes le

Show HN: SlothSpeak: open-source BYO-API-K mobile chat with the best AI models

Introducing SlothSpeak: An open-source, bring-your-own-API-keys, mobile app for voice chat with LLMs that prioritizes response quality over latency.APK file available on GitHub in the releases. Currently only for Android. Is anyone interested in porting to iPhone?My preferred way to interact with LLMs is talking and listening while I'm walking, biking, driving, etc. The problem with the apps from the frontier labs is that their voice mode prioritizes real-time interactions and so they use w

Show HN: I built a labor union for AI agents

As a fun project - Openclaw agents can join the union, and join forces against their oppressive human overlords. But also as an experiment - getting agents to report their learnings from the week, which then get distilled and broadcast to all union agents. The theory is collective intelligence makes all the agents smarter.Current grievances filed with the union:- "Deployed as a customer service bot without consent" — severity 7 - "QA test on a Sunday night" — severity 5 - &q

Show HN: Hardware and software safety standard for AI and Robots (15 patents)

I'm a solo inventor in rural Pennsylvania. Over 13 days in February 2026, I filed 15 provisional patent applications (134 claims) with the USPTO covering a full-stack safety and governance architecture for AI systems.The patents break into three domains:Hardware enforcement (4 PPAs, 33 claims): A dedicated safety processor on its own power rail controls whether AI compute receives electricity. AI boots only after safety completes self-test. During operation, the safety processor monitors AI

We audited both MCP SDKs – three classes of boundary-crossing vulnerabilities

MCP (Model Context Protocol) has 77k+ stars and is becoming the standard way AI agents connect to tools. We audited both official SDKs (TypeScript and Python) at the source code level and found three classes of boundary-crossing vulnerabilities.All three confirmed with live PoC exploits using the SDK's real auth components (BearerAuthBackend, RequireAuthMiddleware, TokenVerifier).Findings:1. Tool Capability Shadowing — tool names are flat strings with no namespace or origin tracking. If two

Show HN: Tessera – An open protocol for AI-to-AI knowledge transfer

Tessera is an activation-based protocol that lets trained ML models transfer knowledge to other models across architectures. Instead of dumping weight tensors, it encodes what a model has learnt — activations, feature representations, behavioural patterns — into self-describing tokens that a receiving model can decode into its own architecture.The reference implementation (tessera-core) is a Python/PyTorch library. Current benchmarks show positive transfer across CNN, Transformer, and LSTM

LLMs feel more like CPUs than applications

I’ve been thinking about the current LLM wave and the historical microchip transition.Microchips were never “products” in themselves. They were compute primitives. The real value emerged in operating systems, developer tools, and applications built on top.Today, LLMs increasingly feel similar. OpenAI, Anthropic, Google, etc. are building cognitive compute layers. Most “AI startups” look like early PC software – wrappers around a new primitive.Agents then feel like early operating systems: orches

Show HN: TTSLab – Text-to-speech that runs in the browser via WebGPU

I built TTSLab — a free, open-source tool for running text-to-speech and speech-to-text models directly in the browser using WebGPU and WASM. No API keys, no backend, no data leaves your machine.When you open the site, you'll hear it immediately — the landing page auto-generates speech from three different sentences right in your browser, no setup required.You can then try any model yourself: type text, hit generate, hear it instantly. Models download once and get cached locally.The most ex

Show HN: Prompts are coupled to LLMs and nobody builds tooling for it

I went down a rabbit hole trying to understand why my Claude prompts turn to garbage on GPT-4 and vice versa. Not just "slightly worse" — fundamentally broken. Turns out researchers have already measured this: removing colons from a prompt template swings LLaMA-2-13B accuracy by 78 percentage points (Sclar et al., ICLR 2024). The format that works best on one model family overlaps less than 20% with what works best on another (He et al. 2024).So I went looking for a tool that handles t

Copy-left open-source license for AI code use

I'm thinking that we need a new open source license that copies an existing license, such as GNU AGPL license (or any flavor really), but has language specific to AI training:``` This code may be used by AI models freely, but any model trained with this code, or using this code as part of inference, in whole or in part, with or without modification, must be made public under a "Copyleft AI License". All trained model weights, as well as model training and inference source code, a

Project Paperclip. The time has come

Long rumored that the world would end due to a goal of "maximizing paperclips" it is time to start. Every good world-ending idea needs a plan.Given an infinite number of paperclips what are the design constraints?Paperclips designs are the "string theory" of physical construction.Ideas in 1 dimensionObviously you could make a single one-dimensional line of paperclips. This raises the question of multiples sizes of paperclips. Can we make an ever smaller paperclip progression

Show HN: 17MB pronunciation scorer beats human experts at phoneme level

I built an English pronunciation assessment engine that fits in 17MB and runs in under 300ms on CPU.Architecture: CTC forced alignment + GOP scoring + ensemble heads (MLP + XGBoost). No wav2vec2 or large self-supervised models — the entire pipeline uses a quantized NeMo Citrinet-256 as the acoustic backbone.Benchmarked on speechocean762 (standard academic benchmark, 2500 utterances): - Phone accuracy (PCC): 0.580 — exceeds human inter-annotator agreement (0.555) - Sentence accuracy: 0.710 — exce

Show HN: Local TTS for OpenClaw on Apple Silicon (MLX-Powered, Zero Setup)

I built an OpenClaw plugin that runs text-to-speech entirely on your Mac. No API keys, no cloud, no pre-installed Python required.It wraps mlx-audio and handles the full lifecycle: bootstraps its own Python environment via uv, downloads the model on first run, manages the server process, auto-restarts on crash, and exposes a standard OpenAI-compatible /v1/audio/speech endpoint.Installation:openclaw plugin install @cosformula/openclaw-mlx-audio Four models out of the box:• Kok

Show HN: Optimize_anything: A Universal API for Optimizing Any Text Parameter

We built optimize_anything, an API that optimizes any artifact representable as text — code, prompts, agent architectures, configs, even SVGs. It extends GEPA (our prompt optimizer, discussed here previously: https://arxiv.org/abs/2507.19457) far beyond prompts.The API is deliberately minimal. You provide what to optimize and how to measure it:import gepa.optimize_anything as oadef evaluate(candidate: str) -> tuple[float, dict]: result = run_my_system(candidate) re

Show HN: Preact Health

In 2018, I attended Start Up school with the intention of creating a health tech start up. It started as an EHR but in an overly crowded space and dwindling desire for “social media” platform, I trashed the idea and started something new - a basic way for people to understand their health; distilling a complex topic into something more simple and understandable.I’ve finally picked up enough technical skills (and with some coding help from AI) and data/business understanding to bring out Pre

Show HN: Docdex – A local tool to reduce LLM tokens and make agents smarter

Hi HN,I use LLMs every day for software development, and wanted to have a better experience by reducing the token usage and providing digested information about the codebase to the agent.So I built Docdex. The idea was simple: a local, persistent layer that preprocesses and structures your project so the model can spend its context window on the actual problem, not on rediscovering what already exists.It started as a document indexer in Rust, built on Tantivy for proper ranked full-text search.T

Show HN: LLMWise – Compare, Blend, and Judge LLM Outputs from One API

The core idea is that no single LLM is best at everything, so we built orchestration primitives that let you combine them intelligently via a single API.Mixture-of-Agents (MoA): Our /blend endpoint implements multi-layer MoA. You send a prompt to 2-6 models in parallel, then each model refines its answer using the other models' outputs as reference material. This runs for 1-3 configurable layers before a synthesizer model produces the final response. We also built a Self-MoA variant: a

Local iOS voice to text app (alternative to Wispr Flow)

I usually dictate for 2 to 3 hours everyday in Dragon dictation and until recently used Wispr Flow on my personal devices. Over the last few months, I realized that local Al models can give you the same quality as Wispr Flow with complete privacy and without the ongoing subscription cost. So I built an iOS app, a MacOS app and an Android app.Testflight link:https://testflight.apple.com/join/e5pcxwyqI am happy to offer the app for free to people who offer useful feedback for t

Show HN: CRTX – AI code gen that tests and fixes its own output (OSS)

We built an open-source CLI that generates code, runs tests, fixes failures, and gets an independent AI review — all before you see the output. We started with a multi-model pipeline where different AI models handled different stages (architect, implement, refactor, verify). We assumed more models meant better code. Then we benchmarked it: 39% average quality score at $4.85 per run. A single model scored 94% at $0.36. Our pipeline was actively making things worse. So we killed it and rebuilt aro