Analysis of methods for training domain-specific language models in the area of legal contracts generation

Authors

Vitalii Volokhovskyi, Kharkiv National University of Radio Electronics

DOI:

https://doi.org/10.30837/2522-9818.2024.2.048

Keywords:

large language model; natural language generation; contract; legal document.

Abstract

The subject of the research is machine learning models and methods for generating legal contracts under limited resources, together with benchmarks for evaluating their performance. The goal of the work is to analyse approaches to developing domain-specific Large Language Models and to find the optimal method for creating independent specialized systems that can generate contracts in different languages and legal systems. The article addresses the following tasks: identifying existing companies and solutions in this area; exploring approaches to generating text in natural language; analysing methods for evaluating and comparing such systems; inspecting the limitations and shortcomings of existing solutions and approaches; and finding the optimal method for developing systems with limited resources. The following results were obtained: approaches to natural language generation and their features were investigated; the Transformer architecture was identified as the modern standard for text generation; different model types based on this architecture were considered; data sources for training were analysed; methods for adapting models to specialized domains were considered; benchmarks for evaluating models on various tasks were reviewed; and shortcomings of existing specialized language models, as well as the incompleteness of existing benchmarks for evaluating the contract generation task, were revealed. The analytical experiment determined that the Retrieval-Augmented Generation method is optimal for solving the given task under the given conditions. The experiment and its results can serve as a basis for further research on developing domain-specific language models with limited resources. Conclusions: the article provides an overview of natural language generation methods based on modern machine learning techniques and considers their advantages and disadvantages for small companies and scientific institutions with limited resources. The work examines the specialized legal domain and the problem of contract generation and determines the optimal method for solving it.
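
As background to the conclusion above, the sketch below illustrates the general Retrieval-Augmented Generation pattern on which that conclusion rests: retrieve the reference clauses most similar to a drafting request, augment the prompt with them, and pass the augmented prompt to a base language model. It is a minimal illustration only; the clause corpus, the bag-of-words retriever, and the generate() placeholder are assumptions made for this sketch, not the system or data described in the article, which would use dense retrieval over a real legal corpus and an actual base model.

```python
# Minimal RAG sketch for contract drafting (illustrative assumptions throughout).
import math
import re
from collections import Counter

# Toy "knowledge base": clause templates the retriever can ground generation on.
CLAUSE_CORPUS = [
    "Confidentiality: the parties shall not disclose proprietary information.",
    "Termination: either party may terminate with thirty days written notice.",
    "Payment: the client shall pay invoices within fourteen calendar days.",
    "Governing law: this agreement is governed by the laws of Ukraine.",
]

def bow(text: str) -> Counter:
    """Bag-of-words vector over lower-cased alphabetic tokens; a stand-in for a dense embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k clauses most similar to the drafting request."""
    q = bow(query)
    ranked = sorted(CLAUSE_CORPUS, key=lambda c: cosine(q, bow(c)), reverse=True)
    return ranked[:k]

def build_prompt(request: str, context: list[str]) -> str:
    """Augment the user request with the retrieved clauses before generation."""
    clauses = "\n".join(f"- {c}" for c in context)
    return (
        "Draft a contract section using only the reference clauses below.\n"
        f"Reference clauses:\n{clauses}\n\nRequest: {request}\n"
    )

def generate(prompt: str) -> str:
    """Placeholder for a call to whatever base model is available (local or hosted)."""
    return "[model output would appear here]\n" + prompt

if __name__ == "__main__":
    request = "termination and notice period for a services agreement"
    print(generate(build_prompt(request, retrieve(request))))
```

In a production setting the structure stays the same; only the retriever (dense passage retrieval over an indexed clause library) and the generator (a pretrained base model) are swapped for real components.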

Author Biography

Vitalii Volokhovskyi, Kharkiv National University of Radio Electronics

PhD student at the Department of Software Engineering

References

"Generative AI for Legal Contracts", available at: https://www.nasdaq.com/articles/generative-ai-for-legal-contracts (last accessed 27.05.2024).

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł. and Polosukhin, I. (2017), "Attention is all you need", Advances in neural information processing systems, № 30. DOI: https://doi.org/10.48550/arXiv.1706.03762

Devlin, J., Chang, M.W., Lee, K., Toutanova, K. (2018), "Bert: Pre-training of deep bidirectional transformers for language understanding", arXiv preprint arXiv:1810.04805. DOI: https://doi.org/10.48550/arXiv.1810.04805

Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., Bikel, D. (2023), "Llama 2: Open foundation and fine-tuned chat models", arXiv preprint arXiv:2307.09288. DOI: https://doi.org/10.48550/arXiv.2307.09288

Jiang, A.Q., Sablayrolles, A., Roux, A., Mensch, A., Savary, B., Bamford, C., Chaplot, D.S., Casas, D.D.L., Hanna, E.B., Bressand, F., Lengyel, G. (2024), "Mixtral of experts", arXiv preprint arXiv:2401.04088. DOI: https://doi.org/10.48550/arXiv.2401.04088

Wu, S., Irsoy, O., Lu, S., Dabravolski, V., Dredze, M., Gehrmann, S., Kambadur, P., Rosenberg, D., Mann, G. (2023), "Bloomberggpt: A large language model for finance", arXiv preprint arXiv:2303.17564. DOI: https://doi.org/10.48550/arXiv.2303.17564

Singhal, K., Tu, T., Gottweis, J., Sayres, R., Wulczyn, E., Hou, L., Clark, K., Pfohl, S., Cole-Lewis, H., Neal, D., Schaekermann, M. (2023), "Towards expert-level medical question answering with large language models", arXiv preprint arXiv:2305.09617. DOI: https://doi.org/10.48550/arXiv.2305.09617

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S. (2020), "Language models are few-shot learners", Advances in neural information processing systems, № 33, P. 1877–1901. DOI: https://doi.org/10.48550/arXiv.2005.14165

Nori, H., Lee, Y.T., Zhang, S., Carignan, D., Edgar, R., Fusi, N., King, N., Larson, J., Li, Y., Liu, W., Luo, R. (2023), "Can generalist foundation models outcompete special-purpose tuning? case study in medicine", arXiv preprint arXiv:2311.16452. DOI: https://doi.org/10.48550/arXiv.2311.16452

Niklaus, J., Matoshi, V., Stürmer, M., Chalkidis, I., Ho, D.E. (2023), "Multilegalpile: A 689gb multilingual legal corpus", arXiv preprint arXiv:2306.02069. DOI: https://doi.org/10.48550/arXiv.2306.02069

Guha, N., Nyarko, J., Ho, D., Ré, C., Chilton, A., Chohlas-Wood, A., Peters, A., Waldon, B., Rockmore, D., Zambrano, D., Talisman, D. (2024), "Legalbench: A collaboratively built benchmark for measuring legal reasoning in large language models", Advances in Neural Information Processing Systems, № 36. DOI: https://doi.org/10.48550/arXiv.2308.11462

Hendrycks, D., Burns, C., Basart, S., Zou, A., Mazeika, M., Song, D., Steinhardt, J. (2020), "Measuring massive multitask language understanding", arXiv preprint arXiv:2009.03300. DOI: https://doi.org/10.48550/arXiv.2009.03300

Wang, A., Pruksachatkun, Y., Nangia, N., Singh, A., Michael, J., Hill, F., Levy, O., Bowman, S. (2019), "Superglue: A stickier benchmark for general-purpose language understanding systems", Advances in neural information processing systems, № 32. DOI: https://doi.org/10.48550/arXiv.1905.00537

Chalkidis, I., Jana, A., Hartung, D., Bommarito, M., Androutsopoulos, I., Katz, D., Aletras, N. (2022), "LexGLUE: A Benchmark Dataset for Legal Language Understanding in English", Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, P. 4310–4330, available at: https://aclanthology.org/2022.acl-long.297

Niklaus, J., Matoshi, V., Rani, P., Galassi, A., Stürmer, M., Chalkidis, I. (2023), "LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain", Findings of the Association for Computational Linguistics: EMNLP 2023, P. 3016–3054, available at: https://aclanthology.org/2023.findings-emnlp.200

Mabey, R. "Unveiling our legal AI Assistant", available at: https://juro.com/blog/legal-ai-assistant (last accessed 27.05.2024).

Browne, R. "An AI just negotiated a contract for the first time ever – and no human was involved", available at: https://www.cnbc.com/2023/11/07/ai-negotiates-legal-contract-without-humans-involved-for-first-time.html (last accessed 27.05.2024).

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y. (2014), "Generative adversarial nets", Advances in Neural Information Processing Systems, № 27. DOI: https://doi.org/10.48550/arXiv.1406.2661

Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W.T., Rocktäschel, T., Riedel, S. (2020), "Retrieval-augmented generation for knowledge-intensive nlp tasks", Advances in Neural Information Processing Systems, № 33, P. 9459–9474. DOI: https://doi.org/10.48550/arXiv.2005.11401

Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., Rodriguez, A. (2023), "Llama: Open and efficient foundation language models", arXiv preprint arXiv:2302.13971. DOI: https://doi.org/10.48550/arXiv.2302.13971

Vanian, J., Leswing, K. "ChatGPT and generative AI are booming, but the costs can be extraordinary", available at: https://www.cnbc.com/2023/03/13/chatgpt-and-generative-ai-are-booming-but-at-a-very-expensive-price.html (last accessed 27.05.2024).

Thoppilan, R., De Freitas, D., Hall, J., Shazeer, N., Kulshreshtha, A., Cheng, H.T., Jin, A., Bos, T., Baker, L., Du, Y., Li, Y. (2022), "Lamda: Language models for dialog applications", arXiv preprint arXiv:2201.08239. DOI: https://doi.org/10.48550/arXiv.2201.08239

Hoffmann, J., Borgeaud, S., Mensch, A., Buchatskaya, E., Cai, T., Rutherford, E., Casas, D.D.L., Hendricks, L.A., Welbl, J., Clark, A., Hennigan, T. (2022), "Training compute-optimal large language models", arXiv preprint arXiv:2203.15556. DOI: https://doi.org/10.48550/arXiv.2203.15556

Team, G., Mesnard, T., Hardin, C., Dadashi, R., Bhupatiraju, S., Pathak, S., Sifre, L., Rivière, M., Kale, M.S., Love, J., Tafti, P. (2024), "Gemma: Open models based on gemini research and technology", arXiv preprint arXiv:2403.08295. DOI: https://doi.org/10.48550/arXiv.2403.08295

"Microsoft Copilot for Sales", available at: https://www.microsoft.com/en-us/ai/microsoft-sales-copilot (last accessed 27.05.2024).

"Luminance's Legal Pre-Trained Transformer, available at: https://www.luminance.com/technology.html (last accessed 27.05.2024).

Lv, K., Yang, Y., Liu, T., Gao, Q., Guo, Q., Qiu, X. (2023), "Full parameter fine-tuning for large language models with limited resources", arXiv preprint arXiv:2306.09782. DOI: https://doi.org/10.48550/arXiv.2306.09782

Xu, L., Xie, H., Qin, S.Z.J., Tao, X., Wang, F.L. (2023), "Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment", arXiv preprint arXiv:2312.12148. DOI: https://doi.org/10.48550/arXiv.2312.12148

Hu, E.J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., Chen, W. (2021), "Lora: Low-rank adaptation of large language models", arXiv preprint arXiv:2106.09685. DOI: https://doi.org/10.48550/arXiv.2106.09685

Karimi Mahabadi, R., Henderson, J., Ruder, S. (2021), "Compacter: Efficient low-rank hypercomplex adapter layers", Advances in Neural Information Processing Systems, № 34, P. 1022–1035. DOI: https://doi.org/10.48550/arXiv.2106.04647

Wang, Y., Agarwal, S., Mukherjee, S., Liu, X., Gao, J., Awadallah, A.H., Gao, J. (2022), "AdaMix: Mixture-of-Adaptations for parameter-efficient model tuning", arXiv preprint arXiv:2205.12410. DOI: https://doi.org/10.48550/arXiv.2205.12410

Karpukhin, V., Oğuz, B., Min, S., Lewis, P., Wu, L., Edunov, S., Chen, D., Yih, W.T. (2020), "Dense passage retrieval for open-domain question answering", arXiv preprint arXiv:2004.04906. DOI: https://doi.org/10.48550/arXiv.2004.04906

Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., Dai, Y., Sun, J., Wang, H. (2023), "Retrieval-augmented generation for large language models: A survey", arXiv preprint arXiv:2312.10997. DOI: https://doi.org/10.48550/arXiv.2312.10997

Zhang, T., Kishore, V., Wu, F., Weinberger, K., Artzi Y. (2020), "BERTScore: Evaluating Text Generation with BERT", International Conference on Learning Representations. DOI: https://doi.org/10.48550/arXiv.1904.09675

"GPT-4o", available at: https://platform.openai.com/docs/models/gpt-4o (last accessed 27.05.2024).

Published

2024-06-30

How to Cite

Volokhovskyi, V. (2024). Analysis of methods for training domain-specific language models in the area of legal contracts generation. INNOVATIVE TECHNOLOGIES AND SCIENTIFIC SOLUTIONS FOR INDUSTRIES, (2(28)), 48–64. https://doi.org/10.30837/2522-9818.2024.2.048