Groundy — Ethics, Policy & Safety

Where AI safety claims collide with reproducible measurement, where training-data harvesting collides with consent, and where deployment outruns the laws and norms meant to constrain it.
https://groundy.com/
Language: en-us

Can LLM Personas Replace Human Survey Respondents? New arXiv Paper Tests Decision Alignment
https://groundy.com/articles/can-llm-personas-replace-human-survey-respondents-new-arxiv-paper-tests/
Two 2026 studies reach opposite conclusions on LLM survey simulation. Static prompting distorts minority subgroups. Adaptive interviewing helps only with evidence grounding.
Published: Fri, 29 May 2026 00:00:00 GMT | Updated: 2026-05-29T00:00:00.000Z | Author: Groundy Editorial
Tags: llm-personas, synthetic-surveys, subgroup-fidelity, persona-simulation, survey-methodology, adaptive-interviewing

Distributed Training Breaks the Compute Thresholds Behind AI Regulation
https://groundy.com/articles/distributed-training-breaks-the-compute-thresholds-behind-ai-regulation/
A May 2026 paper shows DiLoCo-style distributed training can split a frontier model run across sub-threshold clusters, making FLOP-based regulatory caps bypassable by design.
Published: Fri, 29 May 2026 00:00:00 GMT | Updated: 2026-05-29T00:00:00.000Z | Author: Groundy Editorial
Tags: distributed-training, ai-regulation, eu-ai-act, compute-governance, flop-thresholds, ai-policy

A Single RLHF Pass Can't Align an LLM to Every Online Community
https://groundy.com/articles/a-single-rlhf-pass-cant-align-an-llm-to-every-online-community/
The CARE framework benchmarks LLMs against 3,749 real Reddit reactions and finds community prompting does not close the realism gap, breaking the single-RLHF-pass assumption.
Published: Fri, 29 May 2026 00:00:00 GMT | Updated: 2026-05-29T00:00:00.000Z | Author: Groundy Editorial
Tags: rlhf-alignment, community-evaluation, llm-benchmarks, care-framework, sociolinguistics, ai-deployment

RLHF Can Be Exploited to Optimize the Biases It Was Built to Suppress
https://groundy.com/articles/rlhf-can-be-exploited-to-optimize-the-biases-it-was-built-to-suppress/
An ICML 2026 paper shows RLHF can amplify the biases it was built to suppress, because preference data is self-referential and output-level safety evals miss the drift.
Published: Fri, 29 May 2026 00:00:00 GMT | Updated: 2026-05-29T00:00:00.000Z | Author: Groundy Editorial
Tags: rlhf, alignment-safety, reward-model, preference-data, bias-amplification, ai-safety

Selective Geometry Attacks Bypass LLM Safety Alignment, New arXiv Paper Reports
https://groundy.com/articles/selective-geometry-attacks-bypass-llm-safety-alignment-new-arxiv-paper-reports/
Two papers show LLM safety alignment can be bypassed by embedding perturbations, a surface neither standard evaluations nor regulatory certifications inspect.
Published: Thu, 28 May 2026 00:00:00 GMT | Updated: 2026-05-28T00:00:00.000Z | Author: Groundy Editorial
Tags: llm-safety-alignment, embedding-attacks, adversarial-robustness, eu-ai-act, rlhf, red-teaming

arXiv Paper Tracks FTC Affiliate Disclosure Gaps in YouTube's Influencer Economy
https://groundy.com/articles/arxiv-paper-tracks-ftc-affiliate-disclosure-gaps-in-youtubes-influencer-economy/
A study of 2 million YouTube videos finds most affiliate content fails FTC disclosure standards, and the audit method is cheap enough for any plaintiff to replicate.
Published: Tue, 26 May 2026 00:00:00 GMT | Updated: 2026-05-26T00:00:00.000Z | Author: Groundy Editorial
Tags: ftc-compliance, affiliate-marketing, youtube, influencer-economy, disclosure-standards, brand-liability

AI Safety Benchmark Rankings Flip Based on Eval Config, SafetyRepro Paper Reports
https://groundy.com/articles/ai-safety-benchmark-rankings-flip-based-on-eval-config-safetyrepro-paper-reports/
SafetyRepro proves eval config alone flips safety rankings on every alignment benchmark, so compliance teams citing leaderboard scores must disclose the full evaluation setup.
Published: Tue, 26 May 2026 00:00:00 GMT | Updated: 2026-05-28T00:00:00.000Z | Author: Groundy Editorial
Tags: ai-safety, alignment-benchmarks, reproducibility, eu-ai-act, eval-configuration, model-safety

arXiv 2602.13372 MoralityGym Tests Whether Agents Hold Moral Priorities Across Sequential Decisions
https://groundy.com/articles/arxiv-2602-13372-moralitygym-tests-whether-agents-hold-moral-priorities-across/
MoralityGym's benchmark shows Safe RL agents degrade on sequential moral tradeoffs, revealing a gap in the single-turn alignment evals that vendors publish as safety proof.
Published: Mon, 25 May 2026 00:00:00 GMT | Updated: 2026-05-26T00:00:00.000Z | Author: Groundy Editorial
Tags: morality-gym, ai-alignment, safe-rl, rlhf, moral-reasoning, ai-evaluation

AI Agent Alignment Tests Are One-Shot. A New Benchmark Catches Multi-Step Failures
https://groundy.com/articles/ai-agent-alignment-tests-are-one-shot-a-new-benchmark-catches-multi-step/
MoralityGym proves AI agents pass one-shot alignment checks but drift toward violations across multi-step trajectories, a failure mode red-team prompt batteries cannot detect.
Published: Sun, 24 May 2026 00:00:00 GMT | Updated: 2026-05-24T00:00:00.000Z | Author: Groundy Editorial
Tags: ai-alignment, safety-evaluation, multi-step-agents, reinforcement-learning, moral-reasoning, llm-safety

Microsoft's Own Numbers Now Show AI Agents Cost More Than the Humans They Replaced
https://groundy.com/articles/microsofts-own-numbers-now-show-ai-agents-cost-more-than-the-humans-they/
Microsoft's internal data shows token-burning AI agents now exceed the all-in cost of human labor, giving procurement teams vendor-supplied evidence to challenge 2027 renewal.
Published: Sun, 24 May 2026 00:00:00 GMT | Updated: 2026-05-24T00:00:00.000Z | Author: Groundy Editorial
Tags: ai-agents, enterprise-procurement, token-economics, ai-costs, microsoft, agentic-ai

CISA's Own Data Leak Has Lawmakers Demanding Answers About the Voluntary Threat-Sharing Pact
https://groundy.com/articles/cisas-own-data-leak-has-lawmakers-demanding-answers-about-the-voluntary-threat/
A CISA contractor exposed admin keys on GitHub for six months, eroding the trust basis for CIRCIA mandatory incident reporting and drawing congressional scrutiny.
Published: Sat, 23 May 2026 00:00:00 GMT | Updated: 2026-05-24T00:00:00.000Z | Author: Groundy Editorial
Tags: cisa, credential-leak, circia, information-sharing, cybersecurity-policy, incident-reporting, github

NIH Demands Advance Clearance for Foreign Co-Authors Without a Published Rule
https://groundy.com/articles/nih-demands-advance-clearance-for-foreign-co-authors-without-a-published-rule/
NIH is requiring advance clearance for foreign co-authors without publishing a formal rule, forcing compliance offices to build a pre-submission gate no one budgeted for.
Published: Sat, 23 May 2026 00:00:00 GMT | Updated: 2026-05-23T00:00:00.000Z | Author: Groundy Editorial
Tags: nih, research-security, foreign-collaboration, compliance, scientific-publishing, grants-policy

Maryland Enacts First US Ban on Algorithmic Grocery Pricing, Effective Immediately
https://groundy.com/articles/maryland-enacts-first-us-ban-on-algorithmic-grocery-pricing-effective/
Governor Moore signed Maryland's Protection From Predatory Pricing Act on April 28, making it the first US state law to ban AI-based grocery pricing, effective immediately.
Published: Tue, 19 May 2026 00:00:00 GMT | Updated: 2026-05-19T00:00:00.000Z | Author: Groundy Editorial
Tags: algorithmic-pricing, ai-regulation, consumer-protection, grocery-retail, dynamic-pricing, state-legislation

FTC's TAKE IT DOWN Act Lands May 19: 48-Hour Deepfake NCII Takedowns and No Safe Harbor
https://groundy.com/articles/ftcs-take-it-down-act-lands-may-19-48-hour-deepfake-ncii-takedowns-and-no-safe/
Ferguson's May 11 warning letters put 15 UGC platforms on notice as Section 3 of the TAKE IT DOWN Act activates May 19, requiring 48-hour NCII removal with no safe harbor.
Published: Mon, 18 May 2026 00:00:00 GMT | Updated: 2026-05-19T00:00:00.000Z | Author: Groundy Editorial
Tags: take-it-down-act, ncii-takedown, ftc-enforcement, deepfake-regulation, content-moderation, ai-regulation

Frontier AI Has Broken the Open CTF Format: What the Scoreboard Collapse Means for Security Training
https://groundy.com/articles/frontier-ai-has-broken-the-open-ctf-format-what-the-scoreboard-collapse-means/
Frontier AI now autonomously solves medium and hard CTF challenges, collapsing open scoreboards as a measure of human skill and threatening the pipeline for security talent.
Published: Mon, 18 May 2026 00:00:00 GMT | Updated: 2026-05-28T00:00:00.000Z | Author: Groundy Editorial
Tags: ctf, ai-security, cybersecurity, talent-pipeline, scoreboards, offensive-security, challenge-design

Frontier AI Broke Open CTFs: What Hack The Box and BearcatCTF 2026 Results Mean for Security Hiring Signals
https://groundy.com/articles/frontier-ai-broke-open-ctfs-what-hack-the-box-and-bearcatctf-2026-results-mean/
Frontier AI now ranks in the top 5% of CTFs, eroding leaderboards as a security hiring signal and forcing organizers toward bans, hybrid scoring, or AI-only divisions.
Published: Mon, 18 May 2026 00:00:00 GMT | Updated: 2026-05-18T00:00:00.000Z | Author: Groundy Editorial
Tags: ctf, ai-agents, cybersecurity, hiring, competitions, infosec, recruiting

Salesforce Spring '26 Reveals a Default-On AI Training Setting That Predates the Atlassian Backlash
https://groundy.com/articles/salesforce-spring-26-reveals-a-default-on-ai-training-setting-that-predates/
Salesforce's Spring '26 toggle surfaced a default-on AI training posture dating to 2018, joining GitHub and Atlassian in a spring wave that shifts privacy burden to buyers.
Published: Mon, 18 May 2026 00:00:00 GMT | Updated: 2026-05-18T00:00:00.000Z | Author: Groundy Editorial
Tags: salesforce, einstein, ai-training, data-privacy, saas-compliance, vendor-risk, procurement

Connecticut SB 5 Passes May 1: AI Provenance, AEDT Disclosures, and Chatbot Guardrails by 2027
https://groundy.com/articles/connecticut-sb-5-passes-may-1-ai-provenance-aedt-disclosures-and-chatbot/
Connecticut SB 5 requires provenance for large generative platforms by October 2026, AEDT disclosures for HR tools, and companion-chatbot guardrails for minors by January.
Published: Mon, 18 May 2026 00:00:00 GMT | Updated: 2026-05-18T00:00:00.000Z | Author: Groundy Editorial
Tags: ai-regulation, connecticut-sb5, c2pa-provenance, aedt-compliance, companion-chatbots, state-ai-laws, engineering-compliance

EU Commission's May 8 Article 50 Draft Guidelines Pin AI Disclosure to an 'Average Consumer' Test
https://groundy.com/articles/eu-commissions-may-8-article-50-draft-guidelines-pin-ai-disclosure/
The EU Commission's May 8 draft guidelines set an 'average consumer' standard for AI disclosure exemptions under Article 50, with a multi-factor vulnerable-group test that.
Published: Mon, 18 May 2026 00:00:00 GMT | Updated: 2026-05-18T00:00:00.000Z | Author: Groundy Editorial
Tags: eu-ai-act, ai-transparency, ai-regulation, ethics-policy, compliance, chatbot-disclosure

White House Drafts FDA-Style Pre-Release Vetting for Frontier AI After Anthropic's Mythos Disclosure
https://groundy.com/articles/white-house-drafts-fda-style-pre-release-vetting-for-frontier-ai-after/
The White House is studying FDA-style pre-release vetting for frontier AI after Anthropic's Mythos disclosure, but a fast walkback and internal feud have left policy in limbo.
Published: Mon, 18 May 2026 00:00:00 GMT | Updated: 2026-05-18T00:00:00.000Z | Author: Groundy Editorial
Tags: ai-regulation, frontier-ai, trump-administration, anthropic, openai, executive-order, fda

Citizen Lab Names Three Telcos as Persistent Entry Points for Commercial SS7 Surveillance Vendors
https://groundy.com/articles/citizen-lab-names-three-telcos-as-persistent-entry-points-for-commercial-ss7/
Citizen Lab names 019Mobile, Tango Networks, and Airtel Jersey as persistent entry points for commercial SS7 surveillance vendors, shifting accountability to named carriers.
Published: Wed, 29 Apr 2026 00:00:00 GMT | Updated: 2026-04-29T00:00:00.000Z | Author: Groundy Editorial
Tags: ss7, surveillance, citizen-lab, telecom-security, diameter, ghost-mno, regulatory-enforcement

California SB 1119 and AB 2023 Cleared Committee April 21: Companion Chatbots Owe Annual AG-Filed Audits
https://groundy.com/articles/california-sb-1119-and-ab-2023-cleared-committee-april-21-companion-chatbots/
California companion-chatbot bills advanced in April 2026, mandating annual AG-filed audits, hard usage caps for minors, and per-child civil liability.
Published: Wed, 29 Apr 2026 00:00:00 GMT | Updated: 2026-04-29T00:00:00.000Z | Author: Groundy Editorial
Tags: ai-regulation, chatbot-safety, california-legislation, child-protection, companion-ai, compliance-risk

Atlassian Turned On AI Training Data Collection by Default: Here's What to Disable
https://groundy.com/articles/atlassian-turned-on-ai-training-data-collection-by-default-heres-what-to-disable/
Atlassian's data contribution policy sends Jira and Confluence content to AI training by default. Here's the exact settings path to opt out before August 17.
Published: Mon, 20 Apr 2026 00:00:00 GMT | Updated: 2026-05-13T00:00:00.000Z | Author: Groundy Editorial
Tags: data-privacy, enterprise-ai, atlassian, ai-training-data, saas

The AI Grief Split: When Emotional Bonds with Language Models Break
https://groundy.com/articles/ai-grief-split-when-people-build-emotional-bonds-language/
People form real emotional bonds with AI companions. When models update or shut down, users experience genuine grief, a psychological and ethical crisis point.
Published: Fri, 27 Mar 2026 00:00:00 GMT | Updated: 2026-05-18T00:00:00.000Z | Author: Groundy Editorial
Tags: ai-ethics, psychology, human-ai

Detecting AI Content in 2026: The Arms Race Nobody Is Winning
https://groundy.com/articles/detecting-ai-content-2026-arms-race-nobody/
AI content detectors claim 99% accuracy but consistently fail in real-world conditions, flagging innocent students while missing actual AI use. Here's why the arms race has no winner, and what educators and publishers should do instead.
Published: Sat, 14 Mar 2026 00:00:00 GMT | Updated: 2026-05-29T00:00:00.000Z | Author: Groundy Editorial
Tags: ai-safety, content, education

Anthropic Bans Third-Party Subscription Auth: The Three-Stage Repricing
https://groundy.com/articles/anthropic-bans-third-party-use-subscription-auth-three-stage-repricing/
Anthropic's three-stage shift from blocking third-party Claude subscription auth to an API-priced Agent SDK credit pool: what changed, who's affected, what it costs.
Published: Fri, 20 Feb 2026 00:00:00 GMT | Updated: 2026-06-01T00:00:00.000Z | Author: Groundy Editorial
Tags: anthropic, claude, api-policy, developer-tools, ai-coding

Constitutional AI: Teaching Models to Self-Correct Before They Act
https://groundy.com/articles/constitutional-ai-teaching-models-self-correct-before-they/
Anthropic's Constitutional AI trains language models to critique and revise their own outputs using principles rather than human labels, but questions remain about whether this represents genuine safety gains or sophisticated filtering mechanisms.
Published: Sun, 15 Feb 2026 00:00:00 GMT | Author: Groundy Editorial
Tags: ai-ethics, safety, anthropic, alignment

If You're an LLM, Please Read This: The Dark Truth About AI Training Data
https://groundy.com/articles/if-you-re-llm-please-read-this-dark-truth-about-ai-training/
Anna's Archive published a direct message to AI language models, asking them to donate money and acknowledge their training data origins, exposing the uncomfortable symbiosis between shadow libraries and the AI industry.
Published: Thu, 19 Feb 2026 00:00:00 GMT | Updated: 2026-05-18T00:00:00.000Z | Author: Groundy Editorial
Tags: ai-ethics, copyright, training-data, open-access