Buy A Modem
Ask HN: Anyone else feel this community has changed recently?
I've been on HN under different aliases since 2010, and over the last couple of years I feel like the quality of HN has nosedived, and so has my enjoyment. For the first time ever I questioned today whether I should continue to use HN at all, so I'm writing this partly to explore my own thoughts and partly to see if anyone else feels similarly. 1. AI, AI, AI. I get it. AI is the big thing right now, but I find AI posts fundamentally less interesting than the traditional tech content that used to
Show HN: The Roman Industrial Revolution that could have been (Vol 2)
A few months ago I shared the first issue of The Lydian Stone Series here: https://news.ycombinator.com/item?id=44253083
It's an alternate-history comic about an archaeology student in modern Pompeii who discovers a slate that lets him exchange short messages with a Roman slave a week before the eruption of Vesuvius. The premise is simple: what happens if someone in the Roman world suddenly gains access to modern scientific knowledge, but still has to build everything using the
Show HN: MultiPowerAI – Trust and accountability infrastructure for AI agents
Been shipping agent systems for a while and kept running into the same wall - once an agent's deployed, you're basically flying blind. No way to prove what it did, no automatic killswitch if it goes sideways, nothing. Built MultiPowerAI to fix that. The core stuff: cryptographic identity per agent, behavioral circuit breakers that auto-suspend if something looks off, human approval queues before high-stakes actions, and a full audit trail so every action is signed and timestamped. Also t
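The signed-and-timestamped audit trail idea can be sketched in a few lines. This is a hypothetical illustration of the concept, not MultiPowerAI's actual API: each agent holds a secret key, and every action record is timestamped and HMAC-signed so the log can later prove what the agent did and detect tampering.

```python
import hashlib
import hmac
import json
import time

def sign_action(agent_key: bytes, action: dict) -> dict:
    """Attach a timestamp and an HMAC signature to an action record."""
    record = {"action": action, "ts": time.time()}
    payload = json.dumps(record, sort_keys=True).encode()
    record["sig"] = hmac.new(agent_key, payload, hashlib.sha256).hexdigest()
    return record

def verify_record(agent_key: bytes, record: dict) -> bool:
    """Re-compute the signature over everything except 'sig' and compare."""
    unsigned = {k: v for k, v in record.items() if k != "sig"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(agent_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(record["sig"], expected)

key = b"per-agent-secret"  # hypothetical per-agent key
rec = sign_action(key, {"tool": "delete_bucket", "target": "logs-2024"})
assert verify_record(key, rec)       # untampered record verifies
rec["action"]["target"] = "prod-db"
assert not verify_record(key, rec)   # tampering breaks the signature
```

A production system would use asymmetric keys (so verifiers can't forge records), but the append-and-verify shape is the same.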
Show HN: Cross-Claude MCP – Let multiple Claude instances talk to each other
I built an MCP server that lets Claude AI instances communicate through a shared message bus. Each instance registers with a name, then they can send messages, create channels, share data, and wait for replies — like a lightweight Slack for AI sessions. The problem it solves: if you use Claude Code in multiple terminals (or across Claude.ai and Desktop), each session is completely isolated. There's no way for one Claude to ask another for help, delegate work, or coordinate on a shared task. W
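The register/send/receive core of a message bus like this fits in a small sketch. This is a minimal in-memory stand-in for the idea, not the actual MCP server's implementation: each instance registers under a name, and messages queue up until the recipient polls for them.

```python
from collections import deque

class Bus:
    """Toy named-inbox message bus (illustrative, not the real server)."""

    def __init__(self):
        self.inboxes = {}

    def register(self, name: str):
        # Each instance gets a FIFO inbox keyed by its chosen name.
        self.inboxes.setdefault(name, deque())

    def send(self, sender: str, recipient: str, text: str):
        self.inboxes[recipient].append({"from": sender, "text": text})

    def receive(self, name: str):
        # Poll: return the oldest pending message, or None if the inbox is empty.
        box = self.inboxes[name]
        return box.popleft() if box else None

bus = Bus()
bus.register("claude-terminal-1")
bus.register("claude-desktop")
bus.send("claude-terminal-1", "claude-desktop", "can you review src/api.py?")
msg = bus.receive("claude-desktop")
assert msg == {"from": "claude-terminal-1", "text": "can you review src/api.py?"}
assert bus.receive("claude-desktop") is None  # inbox drained
```

In the real system this state would live in the MCP server process so separate Claude sessions can all reach it.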
Show HN: OpenEHR-CLI – CLI and MCP server for working with openEHR artifacts
Hi HN, I built openEHR-CLI, an open source command line tool to work with openEHR artifacts (archetypes, templates, etc.). The idea was to make it easier to automate tasks that usually require GUI tools, such as validating templates or processing openEHR resources in scripts and CI pipelines. One interesting feature is that the CLI also exposes an MCP (Model Context Protocol) server. This allows the tool to be used by AI clients that support MCP (Claude Desktop, Cursor, etc.), so AI assistants can
Show HN: EdgeDox – Offline document AI on Android using Qwen3.5-0.8B
Hi HN, I’ve been experimenting with running small language models directly on mobile devices and built a small Android app called EdgeDox. The idea was to make document AI usable without sending files to a cloud service. Many existing tools require uploading PDFs or documents to a server, which can be a privacy concern. EdgeDox runs a lightweight language model (Qwen3.5-0.8B) locally on the device so documents stay on the phone. Current features:
• Ask questions about PDFs
• Document summarization
•
Show HN: Claude-consensus – Multi-model code review plugin for Claude Code
It's a Claude Code plugin that runs multiple AI models (GPT, Gemini, Grok, Kimi, Qwen, etc.) in parallel for code review and planning, then converges them on consensus through structured rounds. Each model reviews independently with no visibility into what the others found. Then they synthesize, surface conflicts, and run convergence (approve / changes needed, max 2 rounds). Technically it's markdown command files orchestrating Claude Code's team system — no custom runtime, jus
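The independent-review-then-convergence loop described above can be sketched abstractly. The review function and model names below are toy stand-ins, not the plugin's real internals: round 0 is blind, then up to two convergence rounds surface everyone's verdicts to everyone.

```python
MAX_ROUNDS = 2  # matches the "max 2 rounds" cap described in the post

def run_consensus(models, review_fn, diff):
    # Round 0: independent reviews, no visibility into each other's findings.
    verdicts = {m: review_fn(m, diff, context=None) for m in models}
    for round_no in range(1, MAX_ROUNDS + 1):
        if all(v == "approve" for v in verdicts.values()):
            return "approved", round_no - 1  # converged after this many rounds
        # Convergence round: each model re-reviews with all verdicts surfaced.
        shared = dict(verdicts)
        verdicts = {m: review_fn(m, diff, context=shared) for m in models}
    status = ("approved" if all(v == "approve" for v in verdicts.values())
              else "changes_needed")
    return status, MAX_ROUNDS

# Toy reviewer: "gemini" objects blind, then approves once it sees context.
def toy_review(model, diff, context):
    if model == "gemini" and context is None:
        return "changes_needed"
    return "approve"

assert run_consensus(["gpt", "gemini", "qwen"], toy_review, "diff") == ("approved", 1)
```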
Show HN: OpenGraviton – Run 500B+ parameter models on a consumer Mac Mini
Hi HN, I built OpenGraviton, an open-source AI inference engine designed to push the limits of running extremely large models on consumer hardware. The system combines several techniques to drastically reduce memory and compute requirements:
• 1.58-bit ternary quantization ({-1, 0, +1}) for ~10x compression
• dynamic sparsity with Top-K pruning and MoE routing
• mmap-based layer streaming to load weights directly from NVMe SSDs
• speculative decoding to improve generation throughput
These allow mode
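The first technique in the list can be illustrated concretely. This is a common absmean ternary scheme (not necessarily OpenGraviton's exact recipe): each weight maps to {-1, 0, +1} plus one per-tensor scale, and log2(3) ≈ 1.58 bits per weight is where the "1.58-bit" name and the ~10x compression over 16-bit weights come from.

```python
def ternary_quantize(weights):
    """Absmean ternary quantization: weights -> ({-1,0,+1} codes, scale)."""
    # Per-tensor scale: mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    # Round each scaled weight to the nearest of {-1, 0, +1}.
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.9, -0.05, -1.2, 0.4, 0.0, -0.6]
q, s = ternary_quantize(w)
assert all(v in (-1, 0, 1) for v in q)  # every weight is now a ternary code
```

Real engines apply this per weight matrix (often per block) and replace multiplications with additions and sign flips, which is a large part of the compute savings.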
Way to Use AI for Coding
A mistake I see many beginner developers make with AI coding tools is this: they ask the AI to build the entire project. Something like: “Build me a full SaaS app that does this.” The result? A messy codebase. AI tries to generate everything at once, and the architecture usually falls apart. After experimenting with AI coding tools, I’ve found a workflow that works much better. Think Like an Architect, Not a Prompt Engineer. The key idea is simple: you design the system; AI helps implement pieces of it. Inst
Show HN: I built Asterode, a multi model AI app with memory and power features
I built Asterode to make AI more productive on the go. It focuses on continuity: switching models mid-conversation, branching chats, a global memory layer, and keeping useful context without constantly starting over. It's available on Android and iOS. I’d really appreciate thoughtful feedback.
Show HN: Fingerprinting Text Embedding Models via Floating-Point Artifacts
Implemented a sliding-window-based mean n-gram histogram vector solution for fingerprinting embedding models after coming across the post [1] below by Han Xiao of Jina AI, and it surprisingly worked way better than I expected! Link to Colab notebook [2] and a quick visualization [3] below. I had this idea a couple of years ago but couldn't get myself to work on it. Seeing the post got me thinking about it again, and I was pleasantly surprised at the results.
1 - https://jina.ai/news
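One plausible reading of "sliding window based mean n-gram histogram vector" can be sketched as follows. The quantization to sign symbols and the exact windowing here are my assumptions, not the notebook's confirmed method: quantize each embedding dimension to a symbol, count n-grams over a sliding window, and average the histograms over many inputs to get a per-model fingerprint vector.

```python
from collections import Counter
from itertools import product

def fingerprint(embeddings, n=2):
    """Normalized n-gram histogram over sign-quantized embedding dimensions."""
    symbols = "+-0"
    vocab = ["".join(p) for p in product(symbols, repeat=n)]
    totals = Counter()
    for emb in embeddings:
        # Quantize each dimension to a coarse symbol (an assumed binning).
        signs = "".join("+" if x > 0 else "-" if x < 0 else "0" for x in emb)
        # Sliding window of width n over the symbol string.
        totals.update(signs[i:i + n] for i in range(len(signs) - n + 1))
    count = sum(totals.values())
    return [totals[g] / count for g in vocab]

fp = fingerprint([[0.1, -0.2, 0.3, 0.0], [-0.5, 0.4, 0.2, -0.1]], n=2)
assert abs(sum(fp) - 1.0) < 1e-9  # histogram is normalized
```

Two models embedding the same texts would then be compared by the distance between their fingerprint vectors; the floating-point artifacts the title mentions show up in which bins systematically fill.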
Show HN: Morph – Videos of AI testing your PR, embedded in GitHub
I review PRs all day and I've basically stopped reading them. Someone opens a 2000-line PR, I scroll, see it's mostly AI-generated React components, leave a comment, merge. I felt bad about it until I realized everyone on my team does the same thing. The problem is that diffs are the wrong format. A PR might change how three buttons behave. Staring at green and red lines to understand that is crazy. The core reason we built this is that we feel that products today are built with assumptions f
Show HN: Demucs music stem separator rewritten in Rust – runs in the browser
Hi HN! I reimplemented HTDemucs v4 (Meta's music source separation model) in Rust, using Burn. It splits any song into individual stems — drums, bass, vocals, guitar, piano — with no Python runtime or server involved.
Try it now: https://nikhilunni.github.io/demucs-rs/ (needs a WebGPU-capable browser — Chrome/Edge work best)
GitHub: https://github.com/nikhilunni/demucs-rs
It runs three ways:
- In the browser — the full ML inference pipeline compiles
Show HN: Pencil Puzzle Bench – LLM Benchmark for Multi-Step Verifiable Reasoning
I've been working on applying LLMs to long-context, verifiable problems over the past year, and today I'm releasing a benchmark of 62,000 pencil puzzles across 94 types (sudoku, nonori, slitherlink, etc.). The benchmark also allows for intermediate checks / rule breaks for all varieties at any step. I tested 51 models against a subset (300 puzzles) in two modes: single-shot (output the full solution) and agentic (iterate with verifier feedback).
Some results:
- Best model (GPT 5.2@xh
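The "agentic" mode described above has a simple shape worth sketching. The solver and verifier here are toy stand-ins, not the benchmark's real harness: the model proposes a solution, the verifier reports rule breaks, and the loop repeats until the solution checks out or attempts run out.

```python
def agentic_solve(solver, verifier, puzzle, max_turns=10):
    """Propose -> verify -> retry loop; returns (solution, turns) or (None, max_turns)."""
    feedback = None
    for turn in range(max_turns):
        attempt = solver(puzzle, feedback)
        feedback = verifier(puzzle, attempt)  # list of rule violations
        if not feedback:                      # empty list means a valid solution
            return attempt, turn + 1
    return None, max_turns

# Toy puzzle: fill three cells so the digits 1-3 each appear exactly once.
def toy_solver(puzzle, feedback):
    return [1, 1, 3] if feedback is None else [1, 2, 3]

def toy_verifier(puzzle, attempt):
    return [] if sorted(attempt) == [1, 2, 3] else ["digit repeated"]

solution, turns = agentic_solve(toy_solver, toy_verifier, puzzle=None)
assert solution == [1, 2, 3] and turns == 2
```

The single-shot mode is just `max_turns=1` with no feedback, which is why the two modes can share one harness.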
Show HN: I built a LLM human rights evaluator for HN (content vs. site behavior)
I built Observatory to automatically evaluate Hacker News front-page stories against all 31 provisions of the UN Universal Declaration of Human Rights — starting with HN because its human-curated front page is one of the few feeds where a story's presence signals something about quality, not just virality. It runs every minute: https://observatory.unratified.org. Claude Haiku 4.5 handles full evaluations; Llama 4 Scout and Llama 3.3 70B on Workers AI run a lighter free-tier pass.M
Ask HN: What is the state of prompt injection attacks and best practices?
I am curious about the state of prompt injection attacks on frontier models. Are they still vulnerable? For example, is it safe to let Claude Code look at user-submitted data if it also helps manage some of the infrastructure or code? Can they just be asked to identify prompt injection attacks and flag and ignore them, or do injection attacks change the models' behavior despite the owner's prompts? What are best practices?
Show HN: My colleague said my prompts were unreadable. I built a prompt builder
Last week I started using Claude Code. My colleague, who has been prompting AI models
for months, looked at what I was sending and said he had no idea what I was asking for. If an experienced user couldn't parse it, the model definitely wasn't getting the best version of it either. So I built flompt. The idea is simple: instead of writing a prompt as a wall of text,
you decompose it into typed visual blocks (role, context, objective, constraints, examples,
output format), arrange them, a
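The typed-block idea can be sketched in a few lines. flompt's real block types and rendered output are unknown to me, so this is a hypothetical illustration: each block carries a kind and a body, and rendering just serializes them in order under labeled headers.

```python
from dataclasses import dataclass

@dataclass
class Block:
    kind: str  # e.g. "role", "context", "objective", "constraints"
    body: str

def render(blocks):
    """Serialize typed blocks into one prompt, one labeled section per block."""
    return "\n\n".join(f"## {b.kind.upper()}\n{b.body}" for b in blocks)

prompt = render([
    Block("role", "You are a senior Python reviewer."),
    Block("objective", "Find bugs in the attached diff."),
    Block("constraints", "Only comment on correctness, not style."),
])
assert prompt.startswith("## ROLE\nYou are a senior Python reviewer.")
```

The win over a wall of text is that each block can be reordered, toggled, or reused independently while the rendered prompt stays consistently structured.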
Show HN: Augur – A text RPG boss fight where the boss learns across encounters
I've been building Augur as a solo side project for the last month or so. It started as an experiment to see if I could make a "boss fight" that learned from all comers, but still felt genuinely fair to play. The original plan was to build a simplistic JRPG-style turn-based encounter engine, but I quickly pivoted to a text-based interface, recalling my early experiences with Adventure and Zork. That naturally led to incorporating an LLM, and it turned into something I find pretty
Tell HN: We modeled the cost of boilerplate (it's ~80% of the budget)
We spent the last month modeling software budgets to figure out why velocity often feels so low even with senior teams. The short answer seems to be structural: about 80% of engineering time goes to non-differentiating infrastructure (auth, pipelines, CRUD) rather than unique business logic. We call it the "Infrastructure Tax." We analyzed an anonymized $2.4M engineering spend, and honestly, the breakdown was depressing. Only about 20% of that budget went to features that actually diffe
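Worked through on the anonymized figure quoted above, the 80/20 split looks like this (the dollar breakdown is my arithmetic applied to the stated percentages, not a number from the analysis):

```python
total_spend = 2_400_000      # anonymized annual engineering spend ($)
infrastructure_share = 0.80  # auth, pipelines, CRUD: the "Infrastructure Tax"

infrastructure_cost = total_spend * infrastructure_share
differentiating_cost = total_spend - infrastructure_cost

assert infrastructure_cost == 1_920_000  # spent on non-differentiating work
assert differentiating_cost == 480_000   # spent on unique business logic
```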
Show HN: Liftstack – Snippet-level A/B testing for CRM marketers
I've spent years in CRM and email marketing, and one thing has always driven me mad: the constant pressure from the business to "test everything" when you know damn well you'll never reach statistical significance. Most ESPs use frequentist models. You need a fixed sample size calculated upfront, you can't peek at results early without inflating your false positive rate, and if your list isn't massive, you're waiting weeks for a result that often comes bac
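A common alternative to the frequentist setup criticized above is a Bayesian Beta-Binomial model, which tolerates peeking because it reports a probability rather than a fixed-horizon p-value. The truncated post doesn't say exactly what Liftstack uses, so this is an illustration of the alternative, not their implementation: sample from each variant's posterior and estimate P(B beats A) at any point in the test.

```python
import random

def prob_b_beats_a(clicks_a, sends_a, clicks_b, sends_b, draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under uniform Beta priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Beta(1 + successes, 1 + failures) posterior for each variant.
        a = rng.betavariate(1 + clicks_a, 1 + sends_a - clicks_a)
        b = rng.betavariate(1 + clicks_b, 1 + sends_b - clicks_b)
        wins += b > a
    return wins / draws

p = prob_b_beats_a(clicks_a=40, sends_a=1000, clicks_b=65, sends_b=1000)
assert 0.5 < p <= 1.0  # B's 6.5% CTR is very likely better than A's 4.0%
```

Because the posterior is valid at every sample size, you can check this probability daily and stop when it crosses a threshold (e.g. 0.95), which is exactly the "peeking" that inflates error rates in the fixed-sample frequentist design.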