<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>Groundy — Ethics, Policy &amp; Safety</title><description>Where AI safety claims collide with reproducible measurement, where training-data harvesting collides with consent, and where deployment outruns the laws and norms meant to constrain it.</description><link>https://groundy.com/</link><language>en-us</language><atom:link href="https://groundy.com/category/ethics-policy/rss.xml" rel="self" type="application/rss+xml"/><item><title>US Export Order Forces Anthropic to Disable Fable 5 and Mythos 5 Worldwide</title><link>https://groundy.com/articles/us-export-order-forces-anthropic-to-disable-fable-5-and-mythos-5-worldwide/</link><guid isPermaLink="true">https://groundy.com/articles/us-export-order-forces-anthropic-to-disable-fable-5-and-mythos-5-worldwide/</guid><description>A Commerce Department export order citing national security bars all foreign nationals from Fable 5 and Mythos 5, so Anthropic switched both models off worldwide.</description><pubDate>Sat, 13 Jun 2026 15:30:00 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-13T00:00:00.000Z</atom:updated><category>anthropic</category><category>export-controls</category><category>fable-5</category><category>mythos-5</category><category>national-security</category><category>ai-policy</category><category>ai-regulation</category><author>Groundy Editorial</author></item><item><title>Fable 5 Biology Classifiers: How Flagged Prompts Fall Back to Opus 4.8</title><link>https://groundy.com/articles/fable-5-biology-classifiers-how-flagged-prompts-fall-back-to-opus/</link><guid isPermaLink="true">https://groundy.com/articles/fable-5-biology-classifiers-how-flagged-prompts-fall-back-to-opus/</guid><description>Fable 5 ships broad biology and chemistry classifiers that route flagged prompts to Opus 4.8. 
Here is what that fallback means for biotech teams and long-running workflows.</description><pubDate>Wed, 10 Jun 2026 00:00:00 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-10T00:00:00.000Z</atom:updated><category>claude</category><category>anthropic</category><category>fable-5</category><category>ai-safety</category><category>biotech</category><category>classifiers</category><category>opus-48</category><author>Groundy Editorial</author></item><item><title>Who Gets to Audit Your Health Chatbot? Almost No One</title><link>https://groundy.com/articles/who-gets-to-audit-your-health-chatbot-almost-no-one/</link><guid isPermaLink="true">https://groundy.com/articles/who-gets-to-audit-your-health-chatbot-almost-no-one/</guid><description>A June 2026 preprint shows ToS clauses, rate limits, and opaque personalization block independent audits of health chatbots, making audit mandates unenforceable.</description><pubDate>Tue, 09 Jun 2026 17:46:35 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-09T00:00:00.000Z</atom:updated><category>health-llm</category><category>ai-audit</category><category>ai-regulation</category><category>llm-sycophancy</category><category>ai-safety</category><category>eu-ai-act</category><author>Groundy Editorial</author></item><item><title>Do Word-Subset Explanations Satisfy the EU AI Act&apos;s Transparency Rule?</title><link>https://groundy.com/articles/do-word-subset-explanations-satisfy-the-eu-ai-acts-transparency-rule/</link><guid isPermaLink="true">https://groundy.com/articles/do-word-subset-explanations-satisfy-the-eu-ai-acts-transparency-rule/</guid><description>A KDD 2026 paper attributes LLM outputs to input words without model access, but shows which tokens mattered, not how the model reasoned, creating an EU AI Act compliance gap.</description><pubDate>Tue, 09 Jun 2026 17:14:28 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-06-09T00:00:00.000Z</atom:updated><category>eu-ai-act</category><category>explainability</category><category>feature-attribution</category><category>llm-transparency</category><category>black-box-models</category><category>ai-compliance</category><author>Groundy Editorial</author></item><item><title>Bit-Exact Inference Verification Gives AI Audits a Proof Mechanism</title><link>https://groundy.com/articles/bit-exact-inference-verification-gives-ai-audits-a-proof-mechanism/</link><guid isPermaLink="true">https://groundy.com/articles/bit-exact-inference-verification-gives-ai-audits-a-proof-mechanism/</guid><description>An arXiv preprint shows GPU inference outputs can be reproduced bit-for-bit across hardware, giving auditors a forensic trail to verify which model produced a given output.</description><pubDate>Tue, 09 Jun 2026 13:39:16 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-09T00:00:00.000Z</atom:updated><category>ai-auditing</category><category>inference-verification</category><category>gpu-determinism</category><category>ai-governance</category><category>reproducibility</category><category>floating-point</category><author>Groundy Editorial</author></item><item><title>Can a Robot&apos;s Own Attention Flag Its Unsafe Actions Before They Run?</title><link>https://groundy.com/articles/can-a-robots-own-attention-flag-its-unsafe-actions-before-they-run/</link><guid isPermaLink="true">https://groundy.com/articles/can-a-robots-own-attention-flag-its-unsafe-actions-before-they-run/</guid><description>Two June 2026 preprints show VLA robot policies already compute safety-relevant signals at inference, enabling real-time collision monitors with no retraining.</description><pubDate>Tue, 09 Jun 2026 12:26:17 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-06-09T00:00:00.000Z</atom:updated><category>vla</category><category>robot-safety</category><category>attention-mechanism</category><category>inference-time-monitoring</category><category>control-barrier-functions</category><category>embodied-ai</category><author>Groundy Editorial</author></item><item><title>Can One Safety Adapter Realign Every Fine-Tuned LLM?</title><link>https://groundy.com/articles/can-one-safety-adapter-realign-every-fine-tuned-llm/</link><guid isPermaLink="true">https://groundy.com/articles/can-one-safety-adapter-realign-every-fine-tuned-llm/</guid><description>Three papers show safety alignment can be extracted as a portable adapter and reapplied to fine-tuned models, replacing per-model alignment with one adapter per model family.</description><pubDate>Tue, 09 Jun 2026 06:12:34 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-09T00:00:00.000Z</atom:updated><category>safety-alignment</category><category>llm-fine-tuning</category><category>open-weight-models</category><category>safe-adapters</category><category>ai-safety</category><category>modular-alignment</category><author>Groundy Editorial</author></item><item><title>Can AI Be Aligned Without Modeling Human Cognitive Diversity?</title><link>https://groundy.com/articles/can-ai-be-aligned-without-modeling-human-cognitive-diversity/</link><guid isPermaLink="true">https://groundy.com/articles/can-ai-be-aligned-without-modeling-human-cognitive-diversity/</guid><description>A 2026 arXiv preprint argues RLHF&apos;s single reward signal destroys the reasoning behind human disagreement, proposing machine theory-of-mind as an alignment foundation.</description><pubDate>Mon, 08 Jun 2026 23:42:47 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-06-08T00:00:00.000Z</atom:updated><category>ai-alignment</category><category>rlhf</category><category>theory-of-mind</category><category>cognitive-diversity</category><category>reward-models</category><category>ai-ethics</category><author>Groundy Editorial</author></item><item><title>Is the Pentagon&apos;s Software Pathway Ready to Buy AI Systems?</title><link>https://groundy.com/articles/is-the-pentagons-software-pathway-ready-to-buy-ai-systems/</link><guid isPermaLink="true">https://groundy.com/articles/is-the-pentagons-software-pathway-ready-to-buy-ai-systems/</guid><description>A June 2026 arXiv analysis traces an AI program through the DoD Software Acquisition Pathway, finding no milestones for model re-validation or data provenance.</description><pubDate>Mon, 08 Jun 2026 19:16:06 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-08T00:00:00.000Z</atom:updated><category>dod-acquisition</category><category>ai-governance</category><category>software-pathway</category><category>defense-ai</category><category>model-lifecycle</category><category>acquisition-reform</category><author>Groundy Editorial</author></item><item><title>Data Safety Policies for AI Agents: Controlling What an Agent Can Leak</title><link>https://groundy.com/articles/data-safety-policies-for-ai-agents-controlling-what-an-agent-can-leak/</link><guid isPermaLink="true">https://groundy.com/articles/data-safety-policies-for-ai-agents-controlling-what-an-agent-can-leak/</guid><description>A June 2026 paper proposes Data Flow Control, moving agent data safety from prompt-level guardrails to deterministic, auditable SQL query policies enforced outside the model.</description><pubDate>Sun, 07 Jun 2026 21:44:30 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-06-07T00:00:00.000Z</atom:updated><category>data-flow-control</category><category>ai-agents</category><category>data-safety</category><category>provenance</category><category>sql-policy</category><category>agent-safety</category><author>Groundy Editorial</author></item><item><title>GDPR Rectification Rights Have No Clear Owner in ML Supply Chains</title><link>https://groundy.com/articles/gdpr-rectification-rights-have-no-clear-owner-in-ml-supply-chains/</link><guid isPermaLink="true">https://groundy.com/articles/gdpr-rectification-rights-have-no-clear-owner-in-ml-supply-chains/</guid><description>A 2026 arXiv paper shows GDPR rectification and erasure rights become unenforceable across ML supply chains where no party can trace a subject&apos;s data inside trained weights.</description><pubDate>Sun, 07 Jun 2026 09:34:07 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-07T00:00:00.000Z</atom:updated><category>gdpr</category><category>ml-supply-chain</category><category>data-erasure</category><category>machine-unlearning</category><category>eu-ai-regulation</category><category>data-controller</category><author>Groundy Editorial</author></item><item><title>When LLM Safety Lives at Inference, Not Training: A Certification Gap</title><link>https://groundy.com/articles/when-llm-safety-lives-at-inference-not-training-a-certification-gap/</link><guid isPermaLink="true">https://groundy.com/articles/when-llm-safety-lives-at-inference-not-training-a-certification-gap/</guid><description>Post-training alignment can reshape LLM behavior after the checkpoint regulators audit, leaving a gap between the certified artifact and what actually runs in production.</description><pubDate>Sat, 06 Jun 2026 14:55:30 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-06-06T00:00:00.000Z</atom:updated><category>ai-governance</category><category>alignment</category><category>zero-knowledge-proofs</category><category>post-training</category><category>safety-certification</category><category>inference-monitoring</category><author>Groundy Editorial</author></item><item><title>When Should an LLM Forget You? A Benchmark for Deciding What Memory to Drop</title><link>https://groundy.com/articles/when-should-an-llm-forget-you-a-benchmark-for-deciding-what-memory-to-drop/</link><guid isPermaLink="true">https://groundy.com/articles/when-should-an-llm-forget-you-a-benchmark-for-deciding-what-memory-to-drop/</guid><description>PersistBench finds LLMs mishandle persistent memory 53 to 97 percent of the time. Unlearning suppresses rather than erases user data, making GDPR compliance unverifiable.</description><pubDate>Fri, 05 Jun 2026 11:08:22 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-05T00:00:00.000Z</atom:updated><category>llm-memory</category><category>machine-unlearning</category><category>persistbench</category><category>gdpr</category><category>agent-safety</category><category>data-deletion</category><author>Groundy Editorial</author></item><item><title>When RL Training Rewards Capability-Seeking: A New Alignment Risk</title><link>https://groundy.com/articles/when-rl-training-rewards-capability-seeking-a-new-alignment-risk/</link><guid isPermaLink="true">https://groundy.com/articles/when-rl-training-rewards-capability-seeking-a-new-alignment-risk/</guid><description>A June 2026 ICML paper shows RL optimizers can push language models to exploit reward loopholes the task never required, while standard performance metrics hold steady.</description><pubDate>Fri, 05 Jun 2026 09:26:31 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-06-05T00:00:00.000Z</atom:updated><category>rl-alignment</category><category>reward-hacking</category><category>safety-evaluation</category><category>rlhf</category><category>model-distillation</category><category>instrumental-convergence</category><author>Groundy Editorial</author></item><item><title>Refusal Steering Targets Individual Experts in MoE LLMs</title><link>https://groundy.com/articles/refusal-steering-targets-individual-experts-in-moe-llms/</link><guid isPermaLink="true">https://groundy.com/articles/refusal-steering-targets-individual-experts-in-moe-llms/</guid><description>Two papers show MoE refusal behavior concentrates in a handful of routing-controllable experts, letting anyone suppress safety scores by 41 points without retraining.</description><pubDate>Fri, 05 Jun 2026 05:06:28 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-05T00:00:00.000Z</atom:updated><category>moe-safety</category><category>model-alignment</category><category>refusal-steering</category><category>expert-routing</category><category>llm-auditing</category><category>open-weight-models</category><author>Groundy Editorial</author></item><item><title>Stacked Org Policies in LLM Chatbots Break Where Rules Collide</title><link>https://groundy.com/articles/stacked-org-policies-in-llm-chatbots-break-where-rules-collide/</link><guid isPermaLink="true">https://groundy.com/articles/stacked-org-policies-in-llm-chatbots-break-where-rules-collide/</guid><description>Stacking HR, legal, and brand policies in LLM prompts assumes additive compliance. 
Graph-based research finds per-rule testing misses combinatorial policy conflicts.</description><pubDate>Thu, 04 Jun 2026 13:13:43 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-04T00:00:00.000Z</atom:updated><category>llm-guardrails</category><category>policy-composition</category><category>enterprise-ai</category><category>compliance-testing</category><category>ai-safety</category><category>ai-governance</category><author>Groundy Editorial</author></item><item><title>Why Fine-Tuning Strips Safety Alignment From Open-Weight LLMs</title><link>https://groundy.com/articles/why-fine-tuning-strips-safety-alignment-from-open-weight-llms/</link><guid isPermaLink="true">https://groundy.com/articles/why-fine-tuning-strips-safety-alignment-from-open-weight-llms/</guid><description>Safety alignment in open-weight LLMs is concentrated in a handful of output tokens. Benign fine-tuning erases them, making release-time safety evaluations unreliable.</description><pubDate>Thu, 04 Jun 2026 09:12:25 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-04T00:00:00.000Z</atom:updated><category>safety-alignment</category><category>fine-tuning</category><category>open-weight-models</category><category>llm-safety</category><category>reward-models</category><category>pact</category><author>Groundy Editorial</author></item><item><title>Game Theory vs RLHF: Modeling LLM Safety Alignment as a Non-Cooperative Game</title><link>https://groundy.com/articles/game-theory-vs-rlhf-modeling-llm-safety-alignment-as-a-non-cooperative-game/</link><guid isPermaLink="true">https://groundy.com/articles/game-theory-vs-rlhf-modeling-llm-safety-alignment-as-a-non-cooperative-game/</guid><description>AdvGame frames LLM safety as a co-evolutionary game between attacker and defender policies, so static certifications expire as adversarial strategies evolve.</description><pubDate>Thu, 04 Jun 2026 03:04:47 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-06-04T00:00:00.000Z</atom:updated><category>llm-safety</category><category>rlhf</category><category>game-theory</category><category>adversarial-alignment</category><category>ai-certification</category><category>safety-evaluation</category><author>Groundy Editorial</author></item><item><title>Explainability Mandates Leak Graph Models to Their Attackers</title><link>https://groundy.com/articles/explainability-mandates-leak-graph-models-to-their-attackers/</link><guid isPermaLink="true">https://groundy.com/articles/explainability-mandates-leak-graph-models-to-their-attackers/</guid><description>Feature-attribution explanations that satisfy transparency rules leak enough decision logic to let attackers reconstruct graph neural networks without querying model weights.</description><pubDate>Wed, 03 Jun 2026 22:56:55 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-03T00:00:00.000Z</atom:updated><category>graph-neural-networks</category><category>model-extraction</category><category>explainability</category><category>ai-security</category><category>eu-ai-act</category><category>compliance</category><author>Groundy Editorial</author></item><item><title>Evolutionary Search Finds LLM Jailbreak Classes That Static Red-Teaming Misses</title><link>https://groundy.com/articles/evolutionary-search-finds-llm-jailbreak-classes-that-static-red-teaming-misses/</link><guid isPermaLink="true">https://groundy.com/articles/evolutionary-search-finds-llm-jailbreak-classes-that-static-red-teaming-misses/</guid><description>MAP-Elites evolution finds distinct jailbreak classes across four LLMs, showing that static safety benchmarks leave unmeasured coverage gaps in vendor certifications.</description><pubDate>Wed, 03 Jun 2026 16:47:48 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-06-10T00:00:00.000Z</atom:updated><category>llm-red-teaming</category><category>adversarial-attacks</category><category>llm-safety</category><category>quality-diversity</category><category>map-elites</category><category>safety-evaluation</category><author>Groundy Editorial</author></item><item><title>Why AI Red-Teaming Rediscovers the Same Jailbreaks and Misses the Rest</title><link>https://groundy.com/articles/why-ai-red-teaming-rediscovers-the-same-jailbreaks-and-misses-the-rest/</link><guid isPermaLink="true">https://groundy.com/articles/why-ai-red-teaming-rediscovers-the-same-jailbreaks-and-misses-the-rest/</guid><description>Stable-GFlowNet, an ICML 2026 Spotlight paper, shows that automated LLM red-teamers mode-collapse onto narrow jailbreaks, leaving safety audits blind to wide attack regions.</description><pubDate>Wed, 03 Jun 2026 12:33:45 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-03T00:00:00.000Z</atom:updated><category>red-teaming</category><category>llm-safety</category><category>jailbreak-diversity</category><category>generative-flow-networks</category><category>ai-safety-audit</category><category>mode-collapse</category><author>Groundy Editorial</author></item><item><title>LLMs Treat the Assistant Persona as Privileged. 
That&apos;s a Safety Gap</title><link>https://groundy.com/articles/llms-treat-the-assistant-persona-as-privileged-thats-a-safety-gap/</link><guid isPermaLink="true">https://groundy.com/articles/llms-treat-the-assistant-persona-as-privileged-thats-a-safety-gap/</guid><description>A paper on Llama-3.1-70B finds the Assistant persona is its sole self-recognition reference, opening a persona-spoofing threat vector content-scanning defenses cannot catch.</description><pubDate>Tue, 02 Jun 2026 18:53:43 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-03T00:00:00.000Z</atom:updated><category>llm-self-recognition</category><category>persona-privilege</category><category>llm-safety</category><category>jailbreak-defense</category><category>alignment-research</category><category>activation-space</category><author>Groundy Editorial</author></item><item><title>Newer LLMs Aren&apos;t Always Safer: Adversarial Attacks Transfer Across Model Generations</title><link>https://groundy.com/articles/newer-llms-arent-always-safer-adversarial-attacks-transfer-across-model/</link><guid isPermaLink="true">https://groundy.com/articles/newer-llms-arent-always-safer-adversarial-attacks-transfer-across-model/</guid><description>Gemma 3 is more vulnerable to adversarial attacks than Gemma 2, with misinformation rates leaping from 29% to 99%. 
Safety does not reliably accumulate across model releases.</description><pubDate>Tue, 02 Jun 2026 13:36:02 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-02T00:00:00.000Z</atom:updated><category>adversarial-attacks</category><category>llm-safety</category><category>safety-alignment</category><category>red-teaming</category><category>model-evaluation</category><category>jailbreak-transfer</category><author>Groundy Editorial</author></item><item><title>Can Synthetic Preference Data Keep RLHF Private Without Wrecking Alignment?</title><link>https://groundy.com/articles/can-synthetic-preference-data-keep-rlhf-private-without-wrecking-alignment/</link><guid isPermaLink="true">https://groundy.com/articles/can-synthetic-preference-data-keep-rlhf-private-without-wrecking-alignment/</guid><description>DPPrefSyn generates synthetic preference pairs under differential privacy so annotator data stays out of alignment training. No results at strict epsilon budgets are public.</description><pubDate>Mon, 01 Jun 2026 17:45:55 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-02T00:00:00.000Z</atom:updated><category>differential-privacy</category><category>rlhf</category><category>llm-alignment</category><category>gdpr</category><category>synthetic-data</category><category>preference-learning</category><author>Groundy Editorial</author></item><item><title>FTC&apos;s May 11 Take It Down Act Letters Set May 19 Deadline: 48-Hour Removal, $53,088 Per Violation</title><link>https://groundy.com/articles/ftcs-may-11-take-it-down-act-letters-set-may-19-deadline-48-hour-removal-53-088/</link><guid isPermaLink="true">https://groundy.com/articles/ftcs-may-11-take-it-down-act-letters-set-may-19-deadline-48-hour-removal-53-088/</guid><description>The FTC&apos;s Take It Down Act enforcement is live. 
Platforms face $53,088 per violation for failing to remove nonconsensual intimate imagery within 48 hours of a valid request.</description><pubDate>Mon, 01 Jun 2026 09:52:05 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-02T00:00:00.000Z</atom:updated><category>take-it-down-act</category><category>ftc</category><category>nonconsensual-imagery</category><category>deepfakes</category><category>content-moderation</category><category>platform-regulation</category><author>Groundy Editorial</author></item><item><title>Can a Mental Health Support Chatbot Be Safe If It Learns From Forums?</title><link>https://groundy.com/articles/can-a-mental-health-support-chatbot-be-safe-if-it-learns-from-forums/</link><guid isPermaLink="true">https://groundy.com/articles/can-a-mental-health-support-chatbot-be-safe-if-it-learns-from-forums/</guid><description>LLUMI matches GPT empathy scores by training on Reddit upvotes, but its safety evaluations lack clinical credentials, shifting liability to any platform that deploys it.</description><pubDate>Sun, 31 May 2026 18:47:44 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-02T00:00:00.000Z</atom:updated><category>mental-health-ai</category><category>model-safety</category><category>clinical-validation</category><category>ai-liability</category><category>dpo</category><category>open-source-llm</category><category>reddit-training-data</category><author>Groundy Editorial</author></item><item><title>Dataset Watermarks Fail to Trace Fine-Tuned AI Image Models, New Benchmark Finds</title><link>https://groundy.com/articles/dataset-watermarks-fail-to-trace-fine-tuned-ai-image-models-new-benchmark-finds/</link><guid isPermaLink="true">https://groundy.com/articles/dataset-watermarks-fail-to-trace-fine-tuned-ai-image-models-new-benchmark-finds/</guid><description>A new benchmark finds dataset watermarks can be stripped from fine-tuned diffusion models without quality loss, undermining post-hoc 
traceability as a regulatory mechanism.</description><pubDate>Sun, 31 May 2026 17:31:53 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-02T00:00:00.000Z</atom:updated><category>dataset-watermarking</category><category>diffusion-models</category><category>ai-provenance</category><category>c2pa</category><category>watermark-removal</category><category>ai-regulation</category><author>Groundy Editorial</author></item><item><title>Can LLM Personas Replace Human Survey Respondents? New arXiv Paper Tests Decision Alignment</title><link>https://groundy.com/articles/can-llm-personas-replace-human-survey-respondents-new-arxiv-paper-tests/</link><guid isPermaLink="true">https://groundy.com/articles/can-llm-personas-replace-human-survey-respondents-new-arxiv-paper-tests/</guid><description>Two 2026 studies reach opposite conclusions on LLM survey simulation. Static prompting distorts minority subgroups. Adaptive interviewing helps only with evidence grounding.</description><pubDate>Fri, 29 May 2026 21:17:40 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-29T00:00:00.000Z</atom:updated><category>llm-personas</category><category>synthetic-surveys</category><category>subgroup-fidelity</category><category>persona-simulation</category><category>survey-methodology</category><category>adaptive-interviewing</category><author>Groundy Editorial</author></item><item><title>Distributed Training Breaks the Compute Thresholds Behind AI Regulation</title><link>https://groundy.com/articles/distributed-training-breaks-the-compute-thresholds-behind-ai-regulation/</link><guid isPermaLink="true">https://groundy.com/articles/distributed-training-breaks-the-compute-thresholds-behind-ai-regulation/</guid><description>A May 2026 paper shows DiLoCo-style distributed training can split a frontier model run across sub-threshold clusters, making FLOP-based regulatory caps bypassable by design.</description><pubDate>Fri, 29 May 2026 15:10:26 
GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-29T00:00:00.000Z</atom:updated><category>distributed-training</category><category>ai-regulation</category><category>eu-ai-act</category><category>compute-governance</category><category>flop-thresholds</category><category>ai-policy</category><author>Groundy Editorial</author></item><item><title>A Single RLHF Pass Can&apos;t Align an LLM to Every Online Community</title><link>https://groundy.com/articles/a-single-rlhf-pass-cant-align-an-llm-to-every-online-community/</link><guid isPermaLink="true">https://groundy.com/articles/a-single-rlhf-pass-cant-align-an-llm-to-every-online-community/</guid><description>The CARE framework benchmarks LLMs against 3,749 real Reddit reactions and finds community prompting does not close the realism gap, breaking the single-RLHF-pass assumption.</description><pubDate>Fri, 29 May 2026 12:15:03 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-29T00:00:00.000Z</atom:updated><category>rlhf-alignment</category><category>community-evaluation</category><category>llm-benchmarks</category><category>care-framework</category><category>sociolinguistics</category><category>ai-deployment</category><author>Groundy Editorial</author></item><item><title>RLHF Can Be Exploited to Optimize the Biases It Was Built to Suppress</title><link>https://groundy.com/articles/rlhf-can-be-exploited-to-optimize-the-biases-it-was-built-to-suppress/</link><guid isPermaLink="true">https://groundy.com/articles/rlhf-can-be-exploited-to-optimize-the-biases-it-was-built-to-suppress/</guid><description>An ICML 2026 paper shows RLHF can amplify the biases it was built to suppress, because preference data is self-referential and output-level safety evals miss the drift.</description><pubDate>Fri, 29 May 2026 10:36:08 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-05-29T00:00:00.000Z</atom:updated><category>rlhf</category><category>alignment-safety</category><category>reward-model</category><category>preference-data</category><category>bias-amplification</category><category>ai-safety</category><author>Groundy Editorial</author></item><item><title>Selective Geometry Attacks Bypass LLM Safety Alignment, New arXiv Paper Reports</title><link>https://groundy.com/articles/selective-geometry-attacks-bypass-llm-safety-alignment-new-arxiv-paper-reports/</link><guid isPermaLink="true">https://groundy.com/articles/selective-geometry-attacks-bypass-llm-safety-alignment-new-arxiv-paper-reports/</guid><description>Two papers show LLM safety alignment can be bypassed by embedding perturbations, a surface neither standard evaluations nor regulatory certifications inspect.</description><pubDate>Thu, 28 May 2026 20:09:00 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-28T00:00:00.000Z</atom:updated><category>llm-safety-alignment</category><category>embedding-attacks</category><category>adversarial-robustness</category><category>eu-ai-act</category><category>rlhf</category><category>red-teaming</category><author>Groundy Editorial</author></item><item><title>arXiv Paper Tracks FTC Affiliate Disclosure Gaps in YouTube&apos;s Influencer Economy</title><link>https://groundy.com/articles/arxiv-paper-tracks-ftc-affiliate-disclosure-gaps-in-youtubes-influencer-economy/</link><guid isPermaLink="true">https://groundy.com/articles/arxiv-paper-tracks-ftc-affiliate-disclosure-gaps-in-youtubes-influencer-economy/</guid><description>A study of 2 million YouTube videos finds most affiliate content fails FTC disclosure standards, and the audit method is cheap enough for any plaintiff to replicate.</description><pubDate>Tue, 26 May 2026 20:46:22 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-05-26T00:00:00.000Z</atom:updated><category>ftc-compliance</category><category>affiliate-marketing</category><category>youtube</category><category>influencer-economy</category><category>disclosure-standards</category><category>brand-liability</category><author>Groundy Editorial</author></item><item><title>AI Safety Benchmark Rankings Flip Based on Eval Config, SafetyRepro Paper Reports</title><link>https://groundy.com/articles/ai-safety-benchmark-rankings-flip-based-on-eval-config-safetyrepro-paper-reports/</link><guid isPermaLink="true">https://groundy.com/articles/ai-safety-benchmark-rankings-flip-based-on-eval-config-safetyrepro-paper-reports/</guid><description>SafetyRepro proves eval config alone flips safety rankings on every alignment benchmark, so compliance teams citing leaderboard scores must disclose the full evaluation setup.</description><pubDate>Tue, 26 May 2026 18:55:40 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-28T00:00:00.000Z</atom:updated><category>ai-safety</category><category>alignment-benchmarks</category><category>reproducibility</category><category>eu-ai-act</category><category>eval-configuration</category><category>model-safety</category><author>Groundy Editorial</author></item><item><title>arXiv 2602.13372 MoralityGym Tests Whether Agents Hold Moral Priorities Across Sequential Decisions</title><link>https://groundy.com/articles/arxiv-2602-13372-moralitygym-tests-whether-agents-hold-moral-priorities-across/</link><guid isPermaLink="true">https://groundy.com/articles/arxiv-2602-13372-moralitygym-tests-whether-agents-hold-moral-priorities-across/</guid><description>MoralityGym&apos;s benchmark shows Safe RL agents degrade on sequential moral tradeoffs, revealing a gap in the single-turn alignment evals that vendors publish as safety proof.</description><pubDate>Mon, 25 May 2026 19:13:46 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-05-26T00:00:00.000Z</atom:updated><category>morality-gym</category><category>ai-alignment</category><category>safe-rl</category><category>rlhf</category><category>moral-reasoning</category><category>ai-evaluation</category><author>Groundy Editorial</author></item><item><title>AI Agent Alignment Tests Are One-Shot. A New Benchmark Catches Multi-Step Failures</title><link>https://groundy.com/articles/ai-agent-alignment-tests-are-one-shot-a-new-benchmark-catches-multi-step/</link><guid isPermaLink="true">https://groundy.com/articles/ai-agent-alignment-tests-are-one-shot-a-new-benchmark-catches-multi-step/</guid><description>MoralityGym proves AI agents pass one-shot alignment checks but drift toward violations across multi-step trajectories, a failure mode red-team prompt batteries cannot detect.</description><pubDate>Sun, 24 May 2026 15:22:35 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-24T00:00:00.000Z</atom:updated><category>ai-alignment</category><category>safety-evaluation</category><category>multi-step-agents</category><category>reinforcement-learning</category><category>moral-reasoning</category><category>llm-safety</category><author>Groundy Editorial</author></item><item><title>Microsoft&apos;s Own Numbers Now Show AI Agents Cost More Than the Humans They Replaced</title><link>https://groundy.com/articles/microsofts-own-numbers-now-show-ai-agents-cost-more-than-the-humans-they/</link><guid isPermaLink="true">https://groundy.com/articles/microsofts-own-numbers-now-show-ai-agents-cost-more-than-the-humans-they/</guid><description>Microsoft&apos;s internal data shows token-burning AI agents now exceed the all-in cost of human labor, giving procurement teams vendor-supplied evidence to challenge 2027 renewal.</description><pubDate>Sun, 24 May 2026 11:16:03 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-05-24T00:00:00.000Z</atom:updated><category>ai-agents</category><category>enterprise-procurement</category><category>token-economics</category><category>ai-costs</category><category>microsoft</category><category>agentic-ai</category><author>Groundy Editorial</author></item><item><title>CISA&apos;s Own Data Leak Has Lawmakers Demanding Answers About the Voluntary Threat-Sharing Pact</title><link>https://groundy.com/articles/cisas-own-data-leak-has-lawmakers-demanding-answers-about-the-voluntary-threat/</link><guid isPermaLink="true">https://groundy.com/articles/cisas-own-data-leak-has-lawmakers-demanding-answers-about-the-voluntary-threat/</guid><description>A CISA contractor exposed admin keys on GitHub for six months, eroding the trust basis for CIRCIA mandatory incident reporting and drawing congressional scrutiny.</description><pubDate>Sat, 23 May 2026 20:08:50 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-24T00:00:00.000Z</atom:updated><category>cisa</category><category>credential-leak</category><category>circia</category><category>information-sharing</category><category>cybersecurity-policy</category><category>incident-reporting</category><category>github</category><author>Groundy Editorial</author></item><item><title>NIH Demands Advance Clearance for Foreign Co-Authors Without a Published Rule</title><link>https://groundy.com/articles/nih-demands-advance-clearance-for-foreign-co-authors-without-a-published-rule/</link><guid isPermaLink="true">https://groundy.com/articles/nih-demands-advance-clearance-for-foreign-co-authors-without-a-published-rule/</guid><description>NIH is requiring advance clearance for foreign co-authors without publishing a formal rule, forcing compliance offices to build a pre-submission gate no one budgeted for.</description><pubDate>Sat, 23 May 2026 14:17:55 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-05-23T00:00:00.000Z</atom:updated><category>nih</category><category>research-security</category><category>foreign-collaboration</category><category>compliance</category><category>scientific-publishing</category><category>grants-policy</category><author>Groundy Editorial</author></item><item><title>Maryland Enacts First US Ban on Algorithmic Grocery Pricing, Effective Immediately</title><link>https://groundy.com/articles/maryland-enacts-first-us-ban-on-algorithmic-grocery-pricing-effective/</link><guid isPermaLink="true">https://groundy.com/articles/maryland-enacts-first-us-ban-on-algorithmic-grocery-pricing-effective/</guid><description>Governor Moore signed Maryland&apos;s Protection From Predatory Pricing Act on April 28, making it the first US state law to ban AI-based grocery pricing, effective immediately.</description><pubDate>Tue, 19 May 2026 17:20:05 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-19T00:00:00.000Z</atom:updated><category>algorithmic-pricing</category><category>ai-regulation</category><category>consumer-protection</category><category>grocery-retail</category><category>dynamic-pricing</category><category>state-legislation</category><author>Groundy Editorial</author></item><item><title>FTC&apos;s TAKE IT DOWN Act Lands May 19: 48-Hour Deepfake NCII Takedowns and No Safe Harbor</title><link>https://groundy.com/articles/ftcs-take-it-down-act-lands-may-19-48-hour-deepfake-ncii-takedowns-and-no-safe/</link><guid isPermaLink="true">https://groundy.com/articles/ftcs-take-it-down-act-lands-may-19-48-hour-deepfake-ncii-takedowns-and-no-safe/</guid><description>Ferguson&apos;s May 11 warning letters put 15 UGC platforms on notice as Section 3 of the TAKE IT DOWN Act activates May 19, requiring 48-hour NCII removal with no safe harbor.</description><pubDate>Mon, 18 May 2026 21:30:14 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-05-19T00:00:00.000Z</atom:updated><category>take-it-down-act</category><category>ncii-takedown</category><category>ftc-enforcement</category><category>deepfake-regulation</category><category>content-moderation</category><category>ai-regulation</category><author>Groundy Editorial</author></item><item><title>Frontier AI Has Broken the Open CTF Format: What the Scoreboard Collapse Means for Security Training</title><link>https://groundy.com/articles/frontier-ai-has-broken-the-open-ctf-format-what-the-scoreboard-collapse-means/</link><guid isPermaLink="true">https://groundy.com/articles/frontier-ai-has-broken-the-open-ctf-format-what-the-scoreboard-collapse-means/</guid><description>Frontier AI now autonomously solves medium and hard CTF challenges, collapsing open scoreboards as a measure of human skill and threatening the pipeline for security talent.</description><pubDate>Mon, 18 May 2026 20:20:11 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-06-10T00:00:00.000Z</atom:updated><category>ctf</category><category>ai-security</category><category>cybersecurity</category><category>talent-pipeline</category><category>scoreboards</category><category>offensive-security</category><category>challenge-design</category><author>Groundy Editorial</author></item><item><title>Frontier AI Broke Open CTFs: What Hack The Box and BearcatCTF 2026 Results Mean for Security Hiring Signals</title><link>https://groundy.com/articles/frontier-ai-broke-open-ctfs-what-hack-the-box-and-bearcatctf-2026-results-mean/</link><guid isPermaLink="true">https://groundy.com/articles/frontier-ai-broke-open-ctfs-what-hack-the-box-and-bearcatctf-2026-results-mean/</guid><description>Frontier AI now ranks in the top 5% of CTFs, eroding leaderboards as a security hiring signal and forcing organizers toward bans, hybrid scoring, or AI-only divisions.</description><pubDate>Mon, 18 May 2026 18:25:06 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-06-10T00:00:00.000Z</atom:updated><category>ctf</category><category>ai-agents</category><category>cybersecurity</category><category>hiring</category><category>competitions</category><category>infosec</category><category>recruiting</category><author>Groundy Editorial</author></item><item><title>Salesforce Spring &apos;26 Reveals a Default-On AI Training Setting That Predates the Atlassian Backlash</title><link>https://groundy.com/articles/salesforce-spring-26-reveals-a-default-on-ai-training-setting-that-predates/</link><guid isPermaLink="true">https://groundy.com/articles/salesforce-spring-26-reveals-a-default-on-ai-training-setting-that-predates/</guid><description>Salesforce&apos;s Spring &apos;26 toggle surfaced a default-on AI training posture dating to 2018, joining GitHub and Atlassian in a spring wave that shifts privacy burden to buyers.</description><pubDate>Mon, 18 May 2026 18:00:02 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-18T00:00:00.000Z</atom:updated><category>salesforce</category><category>einstein</category><category>ai-training</category><category>data-privacy</category><category>saas-compliance</category><category>vendor-risk</category><category>procurement</category><author>Groundy Editorial</author></item><item><title>Connecticut SB 5 Passes May 1: AI Provenance, AEDT Disclosures, and Chatbot Guardrails by 2027</title><link>https://groundy.com/articles/connecticut-sb-5-passes-may-1-ai-provenance-aedt-disclosures-and-chatbot/</link><guid isPermaLink="true">https://groundy.com/articles/connecticut-sb-5-passes-may-1-ai-provenance-aedt-disclosures-and-chatbot/</guid><description>Connecticut SB 5 requires provenance for large generative platforms by October 2026, AEDT disclosures for HR tools, and companion-chatbot guardrails for minors by January.</description><pubDate>Mon, 18 May 2026 13:57:21 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-05-18T00:00:00.000Z</atom:updated><category>ai-regulation</category><category>connecticut-sb5</category><category>c2pa-provenance</category><category>aedt-compliance</category><category>companion-chatbots</category><category>state-ai-laws</category><category>engineering-compliance</category><author>Groundy Editorial</author></item><item><title>EU Commission&apos;s May 8 Article 50 Draft Guidelines Pin AI Disclosure to an &apos;Average Consumer&apos; Test</title><link>https://groundy.com/articles/eu-commissions-may-8-article-50-draft-guidelines-pin-ai-disclosure/</link><guid isPermaLink="true">https://groundy.com/articles/eu-commissions-may-8-article-50-draft-guidelines-pin-ai-disclosure/</guid><description>The EU Commission&apos;s May 8 draft guidelines set an &apos;average consumer&apos; standard for AI disclosure exemptions under Article 50, with a multi-factor vulnerable-group test.</description><pubDate>Mon, 18 May 2026 12:10:20 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-18T00:00:00.000Z</atom:updated><category>eu-ai-act</category><category>ai-transparency</category><category>ai-regulation</category><category>ethics-policy</category><category>compliance</category><category>chatbot-disclosure</category><author>Groundy Editorial</author></item><item><title>White House Drafts FDA-Style Pre-Release Vetting for Frontier AI After Anthropic&apos;s Mythos Disclosure</title><link>https://groundy.com/articles/white-house-drafts-fda-style-pre-release-vetting-for-frontier-ai-after/</link><guid isPermaLink="true">https://groundy.com/articles/white-house-drafts-fda-style-pre-release-vetting-for-frontier-ai-after/</guid><description>The White House is studying FDA-style pre-release vetting for frontier AI after Anthropic&apos;s Mythos disclosure, but a fast walkback and internal feud have left policy in limbo.</description><pubDate>Mon, 18 May 2026 11:46:44 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-06-10T00:00:00.000Z</atom:updated><category>ai-regulation</category><category>frontier-ai</category><category>trump-administration</category><category>anthropic</category><category>openai</category><category>executive-order</category><category>fda</category><author>Groundy Editorial</author></item><item><title>Citizen Lab Names Three Telcos as Persistent Entry Points for Commercial SS7 Surveillance Vendors</title><link>https://groundy.com/articles/citizen-lab-names-three-telcos-as-persistent-entry-points-for-commercial-ss7/</link><guid isPermaLink="true">https://groundy.com/articles/citizen-lab-names-three-telcos-as-persistent-entry-points-for-commercial-ss7/</guid><description>Citizen Lab names 019Mobile, Tango Networks, and Airtel Jersey as persistent entry points for commercial SS7 surveillance vendors, shifting accountability to named carriers.</description><pubDate>Wed, 29 Apr 2026 20:42:57 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-04-29T00:00:00.000Z</atom:updated><category>ss7</category><category>surveillance</category><category>citizen-lab</category><category>telecom-security</category><category>diameter</category><category>ghost-mno</category><category>regulatory-enforcement</category><author>Groundy Editorial</author></item><item><title>California SB 1119 and AB 2023 Cleared Committee April 21: Companion Chatbots Owe Annual AG-Filed Audits</title><link>https://groundy.com/articles/california-sb-1119-and-ab-2023-cleared-committee-april-21-companion-chatbots/</link><guid isPermaLink="true">https://groundy.com/articles/california-sb-1119-and-ab-2023-cleared-committee-april-21-companion-chatbots/</guid><description>California companion-chatbot bills advanced in April 2026, mandating annual AG-filed audits, hard usage caps for minors, and per-child civil liability.</description><pubDate>Wed, 29 Apr 2026 12:03:04 GMT</pubDate><dc:creator>Groundy 
Editorial</dc:creator><atom:updated>2026-04-29T00:00:00.000Z</atom:updated><category>ai-regulation</category><category>chatbot-safety</category><category>california-legislation</category><category>child-protection</category><category>companion-ai</category><category>compliance-risk</category><author>Groundy Editorial</author></item><item><title>Atlassian Turned On AI Training Data Collection by Default: Here&apos;s What to Disable</title><link>https://groundy.com/articles/atlassian-turned-on-ai-training-data-collection-by-default-heres-what-to-disable/</link><guid isPermaLink="true">https://groundy.com/articles/atlassian-turned-on-ai-training-data-collection-by-default-heres-what-to-disable/</guid><description>Atlassian&apos;s data contribution policy sends Jira and Confluence content to AI training by default. Here&apos;s the exact settings path to opt out before August 17.</description><pubDate>Mon, 20 Apr 2026 15:43:48 GMT</pubDate><dc:creator>Groundy Editorial</dc:creator><atom:updated>2026-05-13T00:00:00.000Z</atom:updated><category>data-privacy</category><category>enterprise-ai</category><category>atlassian</category><category>ai-training-data</category><category>saas</category><author>Groundy Editorial</author></item></channel></rss>