Abishek A, Adanza D, Alemany P, Gifre L, Casellas R, Martínez R, Muñoz R, Vilalta R. End-to-end transport network digital twins with cloud-native SDN controllers and generative AI. Journal of Optical Communications and Networking 2025, 17(7): C70-C81. Overview of deploying digital twins and LLMs in SDN, with a fairly detailed network digital twin (NDT) storyline; its main addition is the network-layer (IP-layer) part, deployed in the authors' own TeraFlow SDN controller; describes the individual functions in some detail, especially those related to LLMs and intent; largely a synthesis of many of the papers cited above.
Large models for wireless communications
Wu D, Wang X, Qiao Y, Wang Z, Jiang J, Cui S, Wang F. NetLLM: Adapting Large Language Models for Networking. Proceedings of the ACM SIGCOMM 2024 Conference; 2024. pp. 661-678. Top-tier conference paper.
Maatouk A, Ayed F, Piovesan N, De Domenico A, Debbah M, Luo Z-Q. TeleQnA: A benchmark dataset to assess large language models telecommunications knowledge. arXiv preprint arXiv:2310.15051, 2023. Telecommunications benchmark dataset.
Xu S, Thomas CK, Hashash O, Muralidhar N, Saad W, Ramakrishnan N. Large multi-modal models (LMMs) as universal foundation models for AI-native wireless systems. arXiv preprint arXiv:2402.01748, 2024. Multimodal.
Fontaine J, Shahid A, De Poorter E. Towards a Wireless Physical-Layer Foundation Model: Challenges and Strategies. arXiv preprint arXiv:2403.12065, 2024. Foundation model.
Liu C, Xie X, Zhang X, Cui Y. Large Language Models for Networking: Workflow, Advances and Challenges. arXiv preprint arXiv:2404.12901, 2024.
Zhou H, Hu C, Yuan Y, Cui Y, Jin Y, Chen C, Wu H, Yuan D, Jiang L, Wu D. Large language model (LLM) for telecommunications: A comprehensive survey on principles, key techniques, and opportunities. arXiv preprint arXiv:2405.10825, 2024.
Khoramnejad F, Hossain E. Generative AI for the Optimization of Next-Generation Wireless Networks: Basics, State-of-the-Art, and Open Challenges. IEEE Communications Surveys & Tutorials 2025: 1-1. Survey.
Chen H, Deng W, Yang S, Xu J, Jiang Z, Ngai ECH, Liu J, Liu X. Towards Edge General Intelligence via Large Language Models: Opportunities and Challenges. IEEE Network 2025: 1-1.
Chen Z, Sun Q, Li N, Li X, Wang Y, Chih-Lin I. Enabling Mobile AI Agent in 6G Era: Architecture and Key Technologies. IEEE Network 2024: 1-1.
Huang Y, Du H, Zhang X, Niyato D, Kang J, Xiong Z, Wang S, Huang T. Large language models for networking: Applications, enabling techniques, and challenges. IEEE Network 2024.
Wang Z, Zhou Y, Shi Y, Letaief KB. Federated Fine-Tuning for Pre-Trained Foundation Models Over Wireless Networks. IEEE Transactions on Wireless Communications 2025: 1-1.
Du J, Lin T, Jiang C, Yang Q, Bader CF, Han Z. Distributed Foundation Models for Multi-Modal Learning in 6G Wireless Networks. IEEE Wireless Communications 2024, 31(3): 20-30.
Chen Z, Zhang Z, Yang Z. Big AI Models for 6G Wireless Networks: Opportunities, Challenges, and Research Directions. IEEE Wireless Communications 2024, 31(5): 164-172.
Mathematical and scientific foundation models
Subramanian S, Harrington P, Keutzer K, Bhimji W, Morozov D, Mahoney MW, Gholami A. Towards foundation models for scientific machine learning: Characterizing scaling and transfer behavior. Advances in Neural Information Processing Systems 2024, 36. Foundation model for scientific computing; analyzes the influence of different factors.
Ye Z, Huang X, Chen L, Liu H, Wang Z, Dong B. PDEformer: Towards a foundation model for one-dimensional partial differential equations. arXiv preprint arXiv:2402.12652, 2024. Represents the PDE in symbolic form.
Hao Z, Su C, Liu S, Berner J, Ying C, Su H, Anandkumar A, Song J, Zhu J. DPOT: Auto-regressive denoising operator transformer for large-scale PDE pre-training. arXiv preprint arXiv:2403.03542, 2024. Good paper.
Wang S, Seidman JH, Sankaran S, Wang H, Pappas GJ, Perdikaris P. Bridging Operator Learning and Conditioned Neural Fields: A Unifying Perspective. arXiv preprint arXiv:2405.13998, 2024. Good paper; borrows structural ideas from DeepONet, and the open-source code runs smoothly (see the sketch after this section's entries).
Zhou H, Ma Y, Wu H, Wang H, Long M. Unisolver: PDE-Conditional Transformers Are Universal PDE Solvers. arXiv preprint arXiv:2405.17527, 2024. The approach feels well grounded in principle.
Herde M, Raonić B, Rohner T, Käppeli R, Molinaro R, de Bézenac E, Mishra S. Poseidon: Efficient Foundation Models for PDEs. arXiv preprint arXiv:2405.19101, 2024.
Hao Z, Wang Z, Su H, Ying C, Dong Y, Liu S, Cheng Z, Song J, Zhu J. GNOT: A general neural operator transformer for operator learning. International Conference on Machine Learning; 2023: PMLR. pp. 12556-12569.
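The Wang et al. note above refers to DeepONet's structure; for reference, below is a minimal sketch of that classic branch/trunk decomposition (not code from any paper listed here). The MLP widths, sensor count, and latent dimension p are arbitrary assumptions for illustration.

```python
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    """Minimal DeepONet: G(u)(y) ~ sum_k branch_k(u) * trunk_k(y).
    A toy illustration of the branch/trunk idea; all sizes are arbitrary."""

    def __init__(self, n_sensors: int, coord_dim: int = 1, width: int = 64, p: int = 32):
        super().__init__()
        # Branch net: encodes the input function u sampled at n_sensors points.
        self.branch = nn.Sequential(nn.Linear(n_sensors, width), nn.Tanh(), nn.Linear(width, p))
        # Trunk net: encodes the query coordinate(s) y.
        self.trunk = nn.Sequential(nn.Linear(coord_dim, width), nn.Tanh(), nn.Linear(width, p))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, u: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        b = self.branch(u)                       # (batch, p)
        t = self.trunk(y)                        # (batch, n_queries, p)
        # Inner product over the shared latent basis of size p.
        return torch.einsum("bp,bqp->bq", b, t) + self.bias


if __name__ == "__main__":
    model = DeepONet(n_sensors=100)
    u = torch.randn(8, 100)                      # 8 input functions at 100 sensor points
    y = torch.rand(8, 50, 1)                     # 50 query coordinates per sample
    print(model(u, y).shape)                     # torch.Size([8, 50])
```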
Time-series foundation models
Bommasani R, Hudson DA, Adeli E, Altman R, Arora S, von Arx S, Bernstein MS, Bohg J, Bosselut A, Brunskill E. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2021. Foundation models.
Chang C, Peng W-C, Chen T-F. LLM4TS: Two-stage fine-tuning for time-series forecasting with pre-trained LLMs. arXiv preprint arXiv:2308.08469, 2023.
Jin M, Wang S, Ma L, Chu Z, Zhang JY, Shi X, Chen P-Y, Liang Y, Li Y-F, Pan S. Time-LLM: Time series forecasting by reprogramming large language models. arXiv preprint arXiv:2310.01728, 2023. (See the sketch after this section's entries for the patching-plus-frozen-backbone idea this paper and LLM4TS share.)
Xu M, Yin W, Cai D, Yi R, Xu D, Wang Q, Wu B, Zhao Y, Yang C, Wang S. A survey of resource-efficient LLM and multimodal foundation models. arXiv preprint arXiv:2401.08092, 2024.
Wang Q, Qian C, Li X, Yao Z, Shao H. Lens: A Foundation Model for Network Traffic. arXiv preprint arXiv:2402.03646, 2024.
Darlow L, Deng Q, Hassan A, Asenov M, Singh R, Joosen A, Barker A, Storkey A. DAM: Towards A Foundation Model for Time Series Forecasting. arXiv preprint arXiv:2407.17880, 2024.
Li C, Gan Z, Yang Z, Yang J, Li L, Wang L, Gao J. Multimodal foundation models: From specialists to general-purpose assistants. Foundations and Trends® in Computer Graphics and Vision 2024, 16(1-2): 1-214.
Liang Y, Wen H, Nie Y, Jiang Y, Jin M, Song D, Pan S, Wen Q. Foundation Models for Time Series Analysis: A Tutorial and Survey. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining; 2024. pp. 6555-6565.
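The LLM4TS and Time-LLM entries above both adapt a pre-trained, frozen LLM to forecasting by slicing the series into patches and learning only small input/output projections around the backbone. The sketch below shows just that patching-plus-frozen-backbone idea; a small nn.TransformerEncoder stands in for the pre-trained LLM, and all dimensions are arbitrary assumptions, so this is not the authors' code.

```python
import torch
import torch.nn as nn

class PatchedTSForecaster(nn.Module):
    """Sketch of 'frozen backbone + learned projections' for forecasting.
    In LLM4TS / Time-LLM the frozen backbone is a real pre-trained LLM;
    here a small TransformerEncoder is used as a stand-in."""

    def __init__(self, patch_len: int = 16, d_model: int = 128, horizon: int = 24):
        super().__init__()
        self.patch_len = patch_len
        # Learned projection from raw patches into the backbone's embedding space.
        self.patch_embed = nn.Linear(patch_len, d_model)
        # Stand-in for the frozen pre-trained backbone (assumption).
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        for p in self.backbone.parameters():
            p.requires_grad = False              # keep the backbone frozen
        # Learned head mapping the last hidden state to the forecast horizon.
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len) univariate history; seq_len divisible by patch_len.
        b, t = x.shape
        patches = x.view(b, t // self.patch_len, self.patch_len)
        tokens = self.patch_embed(patches)       # (batch, n_patches, d_model)
        hidden = self.backbone(tokens)
        return self.head(hidden[:, -1])          # (batch, horizon)


if __name__ == "__main__":
    model = PatchedTSForecaster()
    history = torch.randn(4, 96)                 # 4 series, 96 past steps
    print(model(history).shape)                  # torch.Size([4, 24])
```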
Potentially useful insights from the large-model field
Xiao C, Cai J, Zhao W, Zeng G, Han X, Liu Z, Sun M. Densing Law of LLMs. arXiv preprint arXiv:2412.04315, 2024. Explores how LLM capability density behaves; quite thought-provoking with respect to the factors one can reason about, such as trained model size, number of training epochs, data volume, and parameter count.
Michaud EJ, Liao I, Lad V, Liu Z, Mudide A, Loughridge C, Guo ZC, Kheirkhah TR, Vukelić M, Tegmark M. Opening the AI Black Box: Distilling Machine-Learned Algorithms into Code. Entropy 2024, 26(12). A preliminary discussion of turning learned algorithms into code.
Farquhar S, Kossen J, Kuhn L, Gal Y. Detecting hallucinations in large language models using semantic entropy. Nature 2024, 630(8017): 625-630. A semantics-based analysis of whether an LLM's answers can be trusted (see the sketch after this section's entries).
Marcondes D, Simonis A, Barrera J. Back to basics to open the black box. Nature Machine Intelligence 2024, 6(5): 498-501. Commentary.
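The Farquhar et al. entry above scores reliability by sampling several answers to the same question, clustering them into groups that mean the same thing, and computing the entropy over those clusters. The sketch below shows only that clustering-and-entropy step; `semantically_equivalent` is a placeholder for the paper's bidirectional-entailment check (e.g. an NLI model run both ways), not a real API.

```python
import math
from typing import Callable, List

def semantic_entropy(answers: List[str],
                     semantically_equivalent: Callable[[str, str], bool]) -> float:
    """Cluster sampled answers into meaning-equivalence classes and return the
    entropy over cluster frequencies (higher entropy = more likely confabulation)."""
    clusters: List[List[str]] = []
    for ans in answers:
        for cluster in clusters:
            if semantically_equivalent(ans, cluster[0]):
                cluster.append(ans)
                break
        else:                                    # no existing cluster matched
            clusters.append([ans])
    total = len(answers)
    probs = [len(c) / total for c in clusters]
    return -sum(p * math.log(p) for p in probs)


if __name__ == "__main__":
    # Toy equivalence check (exact match after normalisation) as a stand-in
    # for the bidirectional-entailment model used in the paper.
    equiv = lambda a, b: a.strip().lower().rstrip(".") == b.strip().lower().rstrip(".")
    samples = ["Paris", "paris", "Paris.", "Lyon", "Paris"]
    print(round(semantic_entropy(samples, equiv), 3))   # mostly agreeing answers, low entropy
```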
Large models and communication infrastructure
Qian K, Xi Y, Cao J, Gao J, Xu Y, Guan Y, Fu B, Shi X, Zhu F, Miao R. Alibaba HPN: A Data Center Network for Large Language Model Training. Proceedings of the ACM SIGCOMM 2024 Conference; 2024. Analyzes some traffic characteristics of LLM training and also covers some LLM fundamentals.
Hu Q, Ye Z, Wang Z, Wang G, Zhang M, Chen Q, Sun P, Lin D, Wang X, Luo Y. Characterization of large language model development in the datacenter. 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24); 2024. pp. 709-729.
Poutievski L, Mashayekhi O, Ong J, Singh A, Tariq M, Wang R, Zhang J, Beauregard V, Conner P, Gribble S, Kapoor R, Kratzer S, Li N, Liu H, Nagaraj K, Ornstein J, Sawhney S, Urata R, Vicisano L, Yasumura K, Zhang S, Zhou J, Vahdat A. Jupiter evolving: Transforming Google's datacenter network via optical circuit switches and software-defined networking. Proceedings of the ACM SIGCOMM 2022 Conference; 2022. pp. 66-85.
Liu H, Urata R, Yasumura K, Zhou X, Bannon R, Berger J, Dashti P, Jouppi N, Lam C, Li S, Mao E, Nelson D, Papen G, Tariq M, Vahdat A. Lightwave Fabrics: At-Scale Optical Circuit Switching for Datacenter and Machine Learning Systems. Proceedings of the ACM SIGCOMM 2023 Conference; 2023. pp. 499-515.
Large models and scientific discovery
Romera-Paredes B, Barekatain M, Novikov A, Balog M, Kumar MP, Dupont E, Ruiz FJ, Ellenberg JS, Wang P, Fawzi O, Kohli P. Mathematical discoveries from program search with large language models. Nature 2024, 625(7995): 468-475. Improves an algorithm by having the LLM repeatedly select from and iterate on a program database (see the sketch after this section's entries).
Du M, Chen Y, Wang Z, Nie L, Zhang D. LLM4ED: Large Language Models for Automatic Equation Discovery. arXiv preprint arXiv:2405.07761, 2024.
Ma P, Wang T-H, Guo M, Sun Z, Tenenbaum JB, Rus D, Gan C, Matusik W. LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery. arXiv preprint arXiv:2405.09783, 2024.
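The Romera-Paredes et al. entry (FunSearch) improves an algorithm by repeatedly asking an LLM for new program variants, scoring them with an automatic evaluator, and feeding the best ones back as in-context examples for the next round. The sketch below reproduces only that outer loop under simplified assumptions: `llm_propose` and `evaluate` are hypothetical hooks, and a single score-sorted pool replaces the island-based program database used in the paper.

```python
from typing import Callable, List, Tuple

def program_search(seed_program: str,
                   llm_propose: Callable[[List[str]], str],
                   evaluate: Callable[[str], float],
                   iterations: int = 100,
                   pool_size: int = 10) -> Tuple[str, float]:
    """FunSearch-style outer loop (simplified): prompt the LLM with a few of the
    best programs so far, score each proposal, and keep it if it ranks well.
    `llm_propose` and `evaluate` are hypothetical hooks, not the paper's API."""
    pool: List[Tuple[float, str]] = [(evaluate(seed_program), seed_program)]
    for _ in range(iterations):
        # Show the LLM a few high-scoring programs as in-context examples.
        examples = [prog for _, prog in sorted(pool, reverse=True)[:3]]
        candidate = llm_propose(examples)
        try:
            score = evaluate(candidate)          # sandboxed scoring in the real system
        except Exception:
            continue                             # discard programs that fail to run
        pool.append((score, candidate))
        pool = sorted(pool, reverse=True)[:pool_size]
    best_score, best_program = max(pool)
    return best_program, best_score
```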