We present PFluxTTS, a hybrid text-to-speech (TTS) system that addresses three gaps in flow-matching TTS: the stability–naturalness trade-off, weak cross-lingual voice cloning, and limited audio quality from low-rate mel features. Our contributions are: (1) a dual-decoder design that combines duration-guided and alignment-free models through inference-time vector-field fusion; (2) robust cloning that uses a sequence of speech-prompt embeddings in a FLUX-based decoder, preserving speaker traits across languages without prompt transcripts; and (3) a modified PeriodWave vocoder with super-resolution to 48 kHz. On cross-lingual in-the-wild data, PFluxTTS outperforms F5-TTS, FishSpeech, and SparkTTS, matches ChatterBox in naturalness while achieving a lower word error rate (WER), and surpasses ElevenLabs in speaker similarity. Audio demos are provided below.
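To make the idea of inference-time vector-field fusion concrete, the following is a minimal sketch rather than the system's actual implementation. It assumes that fusion is a fixed convex combination of the two decoders' predicted vector fields and that sampling uses a plain Euler solver; the abstract does not specify either choice. The names `fuse_vector_fields`, `sample_with_fusion`, `alpha`, and `num_steps` are hypothetical, and the two toy fields stand in for the duration-guided and alignment-free decoders, which in practice would be neural networks conditioned on text and speech-prompt embeddings.

```python
import numpy as np


def fuse_vector_fields(v_a, v_b, alpha):
    """Blend two vector-field predictions.

    ASSUMPTION: a fixed convex combination; the real fusion rule may differ.
    """
    return alpha * v_a + (1.0 - alpha) * v_b


def sample_with_fusion(field_a, field_b, x0, alpha=0.5, num_steps=32):
    """Integrate the fused field from t=0 to t=1 with fixed-step Euler.

    field_a and field_b are callables (x, t) -> array shaped like x.
    Both are hypothetical stand-ins for the two decoders.
    """
    x = np.asarray(x0, dtype=np.float64).copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = fuse_vector_fields(field_a(x, t), field_b(x, t), alpha)
        x = x + dt * v  # one Euler step along the fused field
    return x


# Toy usage: two constant fields in place of real decoders.
def toy_field_a(x, t):
    return np.full_like(x, 1.0)


def toy_field_b(x, t):
    return np.full_like(x, 3.0)


x_final = sample_with_fusion(toy_field_a, toy_field_b, np.zeros(4), alpha=0.5)
```

With constant toy fields, the sample ends at the same weighted average of the two field values, which shows how `alpha` shifts the result toward one decoder or the other. A real system might instead vary the weight over time steps or apply fusion only over part of the trajectory; that is left open here.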