Analysis of methods for training domain-specific language models in the area of legal contracts generation
DOI: https://doi.org/10.30837/2522-9818.2024.2.048

Keywords: large language model; natural language generation; contract; legal document.

Abstract
The subject of the research is machine learning models and methods for generating legal contracts under limited resources, together with benchmarks for evaluating their performance. The goal of the work is to analyse approaches to developing domain-specific Large Language Models and to find the optimal method of creating independent specialized systems that can generate contracts in different languages and legal systems. The article addresses the following tasks: identifying existing companies and solutions in this area, exploring approaches to generating text in natural language, analysing methods for evaluating and comparing such systems, examining the limitations and shortcomings of existing solutions and approaches, and finding the optimal method of developing such systems with limited resources. The following results were obtained: approaches to natural language generation and their features were investigated; the Transformer architecture was identified as the modern standard in the field of text generation; different model types based on this architecture were considered; data sources for training were analysed; methods of adapting models to specialized domains were considered; model evaluation benchmarks for various tasks were reviewed; and shortcomings of existing specialized language models, as well as the incompleteness of existing benchmarks for evaluating the contract generation task, were revealed. As a result of the analytical experiment, it was determined that the Retrieval-Augmented Generation method is the optimal one for solving the given task under the given conditions. The conducted experiment and its results can serve as a basis for further research on developing domain-specific language models with limited resources. Conclusions: the article provides an overview of natural language generation methods based on modern machine learning techniques and considers their advantages and disadvantages for small companies and scientific institutions with limited resources. The work examines the specialized legal domain and the problem of contract generation and determines the optimal method to solve it.
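To make the retrieve-then-generate idea behind the abstract's conclusion concrete, the following minimal sketch illustrates the Retrieval-Augmented Generation pattern: a drafting request is matched against a library of reference clauses, and the retrieved clauses are packed into the prompt of a general-purpose model instead of retraining it. The clause corpus, the bag-of-words similarity scoring (standing in for a dense retriever), and all function names are hypothetical illustrations, not the system described in the article.

```python
# Minimal RAG sketch for contract drafting (illustrative only).
# Assumptions: a toy in-memory clause library and a simple lexical retriever;
# the assembled prompt would be sent to a separately hosted language model.
from collections import Counter
import math


def tokenize(text: str) -> list[str]:
    return [t.lower().strip(".,;:()") for t in text.split()]


def score(query: str, passage: str) -> float:
    """Bag-of-words cosine similarity used here in place of a dense retriever."""
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    dot = sum(q[t] * p[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in p.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k clauses most similar to the drafting request."""
    return sorted(corpus, key=lambda c: score(query, c), reverse=True)[:k]


def build_prompt(request: str, clauses: list[str]) -> str:
    """Assemble the prompt a general-purpose LLM would complete."""
    context = "\n".join(f"- {c}" for c in clauses)
    return (
        "Draft a contract section using only the reference clauses below.\n"
        f"Reference clauses:\n{context}\n"
        f"Request: {request}\n"
    )


if __name__ == "__main__":
    # Hypothetical mini-corpus; a real system would index a jurisdiction-specific clause library.
    clause_library = [
        "The Supplier shall deliver the Goods within 30 calendar days of the Order.",
        "Either Party may terminate this Agreement with 60 days written notice.",
        "All disputes shall be resolved by arbitration in the seat agreed by the Parties.",
    ]
    request = "termination conditions with notice period"
    prompt = build_prompt(request, retrieve(request, clause_library))
    print(prompt)  # this prompt would then be passed to the generator model
```

In a complete system the clause library, retriever, and generator would be replaced by domain-specific components; the pattern itself is what lets a general-purpose model produce jurisdiction-appropriate contract text without the cost of full retraining or large-scale fine-tuning.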