Eva Qwen 2.5 is a family of uncensored AI roleplay models fine-tuned on top of Alibaba's Qwen 2.5 base. Private LLM runs the 1.5B, 7B, 14B, and 32B variants entirely on-device across iPhone, iPad, and Mac - no prompts leave the phone, no subscription, and no guardrail rejections when you write NSFW fiction or dungeon-master a tabletop session. This post covers which Eva Qwen 2.5 model fits which device, the ChatML system prompt that unlocks the roleplay behavior, and how it stacks up against the newer Qwen3 4B uncensored variants.

Key Takeaways

Eva Qwen 2.5 is uncensored, unlike stock Qwen 2.5. The EVA-UNIT-01 team fine-tuned the Qwen 2.5 weights on a ChatML roleplay dataset so the model drops the refusal layer that Alibaba ships.
Four sizes cover every Apple device tier: 1.5B on 4GB iPhones, 7B on iPhone 15 Pro and iPhone 16, 14B on 16GB iPads and Macs, 32B on 32GB+ Apple Silicon Macs.
Private LLM's OmniQuant quantization keeps a 3-bit Eva Qwen 2.5 32B close to 4-bit RTN quality from Ollama and LM Studio, so you lose less intelligence when the model shrinks to fit unified memory.
Use the ChatML roleplay-instruct system prompt from EVA-UNIT-01's repo. The model behaves more politely and less in-character without it.
For 4GB and 6GB iPhones, the newer Qwen3 4B abliterated and Qwen3 4B heretic variants are often the closer fit; Eva Qwen 2.5 1.5B is for older hardware that cannot hold a 4B.

Tested on iPhone 17 Pro (iOS 26), iPad Pro M4, and MacBook Pro M4 Max with 64GB of unified memory - April 2026.

What Is Eva Qwen 2.5?

Eva Qwen 2.5 is an uncensored, roleplay-focused fine-tune of Alibaba's Qwen 2.5 large language model family. The EVA-UNIT-01 collective on HuggingFace trained the Qwen 2.5 weights on a ChatML roleplay dataset, producing four model sizes (1.5B, 7B, 14B, and 32B) that follow character prompts, generate long-form fiction, and do not refuse adult or violent scenes.

The base Qwen 2.5 family, released by Alibaba in September 2024, spans 0.5B to 72B parameters and supports 29+ languages with strong instruction-following. Stock Qwen 2.5 refuses a significant share of NSFW and roleplay prompts. Eva Qwen 2.5 removes that refusal behavior while keeping the long-context coherence that makes the base model useful for extended fiction.

Private LLM packages all four Eva Qwen 2.5 sizes with OmniQuant or GPTQ quantization, so they run on Apple Silicon without falling back to generic RTN-quantized GGUF files.

Is Qwen 2.5 Uncensored? How Eva Qwen 2.5 Unlocks It

Stock Qwen 2.5 is not uncensored. Eva Qwen 2.5 is. The Alibaba release ships with a safety layer that refuses adult content, graphic violence, and most roleplay scenarios involving power dynamics or dangerous topics. Eva Qwen 2.5 is a community fine-tune that replaces that layer with roleplay-leaning training data, so the model answers directly instead of breaking character with a "sorry, as an AI…" boilerplate.

The difference shows up within the first few turns of a scene. Ask stock Qwen 2.5 to narrate a sword fight between two protagonists and it softens the prose; ask Eva Qwen 2.5 7B the same thing and it writes the blocking, the sound, and the injuries. Ask it to play a morally compromised character and it stays in persona for dozens of turns without drifting back to "I'm just an AI."

Private LLM runs every prompt on-device, so none of that roleplay leaves your phone or Mac. No API key, no server log, no account trail.

Eva Qwen 2.5 vs Other Uncensored AI Roleplay Models

Eva Qwen 2.5 is one of several uncensored roleplay lineages available in Private LLM. The short version: Eva Qwen 2.5 leads on long-form coherence in the 14B-32B sizes, Qwen3 4B leads on iPhone speed and newer architecture, and Euryale on Llama 3.3 70B leads on raw prose quality if you have the Mac to run it.

Eva Qwen 2.5 vs Qwen3 4B Abliterated and Heretic

Qwen3 is the newer Qwen generation. Qwen3 4B abliterated and Qwen3 4B heretic are small, fast, and run comfortably on 6GB and 8GB iPhones. Eva Qwen 2.5 14B and 32B are heavier, slower, and only usable on iPads and Macs, but they carry more character detail across a long scene and drift less at the 15-to-30-turn mark.

Pick a Qwen3 4B variant if you roleplay on an iPhone. Pick Eva Qwen 2.5 14B or 32B if you write long-form fiction on a 16GB+ Mac.

Eva Qwen 2.5 vs Llama 3.3 70B Euryale

Euryale v2.3 on Llama 3.3 70B is the heavyweight option and runs only on 48GB+ Apple Silicon Macs. Prose quality is a step above Eva Qwen 2.5 32B for most scene types, but the download is larger, inference is slower, and the hardware bar is much higher. On a 64GB M4 Max, Euryale is the better roleplay model; on a 32GB Mac, Eva Qwen 2.5 32B is the only option of the two that fits.

For a wider view of uncensored options, see our guide to the best uncensored AI chat apps you can run locally in 2026.

Which Eva Qwen 2.5 Model Is Best for Your Apple Device?

Pick the largest Eva Qwen 2.5 variant your device memory can hold. Quantization in Private LLM compresses the weights, but unified memory is still the ceiling. A 14B will not fit on an 8GB iPhone, and pushing the 32B onto a 16GB Mac will swap and stall mid-reply.

Eva Qwen 2.5 7B running on iPhone inside Private LLM, showing an uncensored roleplay reply — Eva Qwen 2.5 7B in Private LLM on iPhone, holding character across a long roleplay turn.

Eva D Qwen2.5 1.5B v0.0 for 4GB iPhones

Runs on iPhones with 4GB or more RAM: iPhone 12, iPhone 12 mini, iPhone 13, iPhone 13 mini, iPhone SE (3rd generation). The 1.5B is the right option when your phone cannot hold a 4B-class model. For 6GB iPhones that can fit a 4B, we now recommend Qwen3 4B abliterated instead.

Eva Qwen2.5 7B v0.1 for iPhone 15 Pro, iPhone 16, and 8GB iPad

Runs on devices with 8GB or more RAM: iPhone 15 Pro, iPhone 16, iPhone 16 Pro, 8GB iPad Air M2, 8GB iPad Pro, and any 8GB+ Apple Silicon Mac. The 7B is the lightest Eva Qwen variant with enough parameters to carry a long scene without forgetting side characters.

Eva Qwen2.5 14B v0.2 for 16GB iPad Pro M4 or Mac

Runs on devices with 16GB or more RAM: iPad Pro M4 configurations with 16GB+ memory and any Apple Silicon Mac with 16GB+ unified memory. This is the tier where Eva Qwen 2.5 starts noticeably out-writing its smaller siblings on long-form fiction.

Eva Qwen2.5 32B v0.2 for 32GB+ Mac

Mac-only, 32GB or more RAM. The 32B handles 30+ turn roleplay scenes, complex character webs, and extended context without the small-model drift that shows up in the 7B and 14B around the 15-turn mark.

Which Qwen Model Is Best for Roleplay?

For roleplay quality at each tier: Qwen3 4B abliterated on iPhone, Eva Qwen 2.5 14B on a 16GB iPad or Mac, Eva Qwen 2.5 32B on a 32GB+ Mac, and Euryale on Llama 3.3 70B on a 48GB+ Mac. Character consistency, context length, and refusal rate are the three things that actually matter; this shortlist balances them against what your hardware can hold.

Can Qwen Do NSFW Roleplay?

Stock Qwen refuses most explicit NSFW content, but uncensored fine-tunes like Eva Qwen 2.5 answer without the safety-layer refusals. Alibaba's default Qwen 2.5 weights include guardrails that block sexual content, graphic violence, and many morally compromised roleplay scenarios. Community fine-tunes (Eva Qwen 2.5, Qwen3 4B abliterated, Qwen3 4B heretic) are trained without those refusals.

Two things matter when you want reliable NSFW roleplay on a local model:

Use a fine-tune, not the base model. Jailbreak prompts fail on stock Qwen 2.5 more often than they succeed; Eva Qwen 2.5 does not need one.
Use the right system prompt. The ChatML roleplay-instruct prompt from EVA-UNIT-01's repo is load-bearing. Without it, Eva Qwen still answers, but with less character commitment and occasional polite retreats.

Because Private LLM runs every prompt on-device, your scene stays on your phone. Nothing is sent to a server, there is no account to tie a conversation to, and there is no rate limit.

System Prompt for Eva Qwen 2.5 Roleplay

EVA-UNIT-01 ships a ChatML roleplay-instruct system prompt in the model repo. Paste it into Private LLM's system prompt field before you start a scene. Eva Qwen 2.5 is significantly better with it than without.

Starter line:

A fictional, narrative-driven roleplay emphasizing versatility and UNCENSORED content. Adherence to the Role-playing Guidelines is mandatory. Refer to the Role-play Context for accurate information.

Full prompt JSON: ChatML Roleplay-v1.9 Instruct.

For longer scenes, pair the system prompt with a character sheet in the first user turn. Eva Qwen 2.5 holds the sheet across Private LLM's 8K iPhone / 32K Mac context without needing a separate RAG store. See our story-writing and roleplay guide for the sheet template we use on the 70B side of the family.

Why Run Eva Qwen 2.5 on Private LLM Instead of Ollama or LM Studio?

Private LLM uses OmniQuant and GPTQ quantization. Ollama and LM Studio default to RTN (round-to-nearest) quantization, which is simpler to produce and less faithful to the original weight distribution. On the same Mac, a 3-bit OmniQuant Eva Qwen 2.5 32B in Private LLM tracks close to a 4-bit RTN build of the same model at a smaller memory footprint, and runs without a separate CLI.

Practical differences:

Better perplexity at the same bit width. OmniQuant preserves more of the model's original behavior when it shrinks, so the prose stays coherent further into a scene.
Runs on iPhone and iPad. LM Studio is a desktop app for macOS, Windows, and Linux. Ollama has desktop apps and a CLI/API, but not a native iOS or iPadOS chat app. Private LLM runs the 7B on iPhone 15 Pro and the 32B on a 32GB Mac from the same codebase.
One App Store purchase, Family Sharing for six. No subscription, no API key, no account. See our Ollama vs Private LLM comparison and LM Studio vs Private LLM comparison for the side-by-side.

Frequently Asked Questions

Is Eva Qwen 2.5 Actually Uncensored?

Yes. Eva Qwen 2.5 is a community fine-tune of Alibaba's Qwen 2.5 that replaces the base model's safety-refusal behavior with roleplay-leaning training data. It writes NSFW fiction, graphic scenes, and morally compromised characters without breaking the fourth wall to refuse.

Which Eva Qwen 2.5 Model Should I Download First?

Pick the largest one that fits your device memory. A 32GB+ Mac should download the 32B; a 16GB Mac or iPad Pro M4 the 14B; an iPhone 15 Pro or iPhone 16 the 7B. On older iPhones with 4GB of RAM, use Eva Qwen 2.5 1.5B, or Qwen3 4B abliterated if your phone has 6GB or 8GB.

Does Qwen AI Allow NSFW Content?

Stock Qwen models refuse most explicit NSFW prompts out of the box. Uncensored community fine-tunes like Eva Qwen 2.5, Qwen3 4B abliterated, and Qwen3 4B heretic strip that refusal layer and answer NSFW roleplay prompts directly. In Private LLM, the entire exchange stays on-device.

Can My iPhone Actually Run Eva Qwen 2.5?

Yes, on the right tier. iPhone 15 Pro, iPhone 16, and iPhone 16 Pro have 8GB of RAM and run the 7B variant. Older iPhones with 4GB (iPhone 12 through iPhone 13 and iPhone SE 3rd generation) run the 1.5B. Private LLM downloads the weights once; after that, inference runs fully offline.

Is Eva Qwen 2.5 Better Than Qwen3 4B for Roleplay?

On a 16GB+ Mac or iPad Pro, Eva Qwen 2.5 14B and 32B beat Qwen3 4B on long-form coherence because they have more parameters to work with. On an iPhone, Qwen3 4B abliterated is faster and uses the newer Qwen3 architecture, so it is often the better everyday roleplay choice on a phone.

Does Private LLM Send My Roleplay Prompts Anywhere?

No. Every prompt stays on-device. Private LLM runs inference locally, does not require an account, does not make network calls after the initial model download, and does not log conversations. See the privacy policy for scope and limits.

What System Prompt Works Best for Eva Qwen 2.5?

Use EVA-UNIT-01's ChatML roleplay-instruct JSON, linked above. It sets the character frame, tells the model that refusals are off the table in-scene, and defines how it should handle the Role-playing Guidelines and Role-play Context blocks. Without this prompt, the model still works but reads more hedged.

Start Roleplaying With Eva Qwen 2.5 Today

Eva Qwen 2.5 is the roleplay-focused Qwen fine-tune that runs natively on Apple Silicon. Download Private LLM once, pick the model size that fits your device, paste the EVA-UNIT-01 roleplay system prompt, and start writing offline, uncensored, with no subscription. For the broader uncensored lineup in the app, read our best uncensored AI chat guide for 2026, or jump straight to the Llama 3.3 70B uncensored release if you have a 48GB+ Mac.

Download Private LLM on the App Store and unlock Eva Qwen 2.5 on iPhone, iPad, and Mac.