MLSQL 1.3.0 Development Version: Enhancing Scheduling, Resources, and Caching
What Is MPIP?
MLSQL uses MPIPs (MLSQL Project Improvement Proposals) to formalize and review proposed features, so that each one aligns with the project's goals and values. The process is modeled on the one used by the more mature and stable Spark project, and it standardizes how features are proposed and reviewed.
Three MPIPs in the Latest 1.3.0 Development Version
In the latest development version of MLSQL 1.3.0, three MPIPs have been put forward to address pressing needs in real-world scenarios:
- MPIP-1031: Table Cache Function
One pain point in MLSQL is that cached tables have to be released manually, which is inconvenient across batch, streaming, and machine learning workloads. MPIP-1031 proposes a table cache function that releases cached tables automatically once the script completes, so users can cache data in memory without cleaning up by hand.
Code Snippet:
select 1 as a as table1;
-- cache table1 for the duration of this script
!cache table1 script;
select * from table1 as output;
-- explicit release, still available when needed
!uncache table1;
The table caching feature automatically releases the cache once the script is finished, freeing up system resources.
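The script-scoped lifetime described above can be illustrated outside MLSQL with a small Python sketch. It is not MLSQL's implementation: `TableCache` and `script_scope` are hypothetical names invented for this example, and the cache is a plain dictionary. The point is only to show the idea that anything cached while a script runs is released when the script ends, even if the script fails.

```python
from contextlib import contextmanager


class TableCache:
    """Toy in-memory stand-in for an engine's table cache (hypothetical)."""

    def __init__(self):
        self.tables = {}

    def cache(self, name, rows):
        self.tables[name] = rows

    def uncache(self, name):
        # Releasing a table that is not cached is a no-op.
        self.tables.pop(name, None)


@contextmanager
def script_scope(cache):
    """Release every table cached inside the block once the block exits."""
    cached_before = set(cache.tables)
    try:
        yield cache
    finally:
        # Runs on normal completion and on errors alike.
        for name in set(cache.tables) - cached_before:
            cache.uncache(name)


cache = TableCache()
with script_scope(cache) as c:
    c.cache("table1", [{"a": 1}])
    rows_during_script = list(c.tables["table1"])

tables_after_script = sorted(cache.tables)
```

Inside the `with` block the cached rows are available; after it, `tables_after_script` is empty without any explicit `uncache` call.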
- MPIP-1045: Built-in Timing Task
A built-in scheduled task makes the MLSQL Stack a more complete service. It runs a script at fixed intervals, which lets users automate repetitive work. The syntax is crontab-like and simple to use, as shown below:
Code Snippet:
!crontab */5 * * * * self;
-- your script content
select * from hive1 as hiveTable2;
save ......
This feature simplifies the scheduling of tasks, making it easier for users to automate their workflows.
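For readers unfamiliar with crontab syntax, the Python sketch below shows how a standard five-field expression such as `*/5 * * * *` is read: the first field is the minute, and `*/5` means every fifth minute. The helper names `parse_crontab` and `expand_minute_field` are made up for this illustration, only a small subset of cron syntax is handled, and nothing here reflects how MLSQL itself parses the command.

```python
def parse_crontab(expression):
    """Split a five-field cron expression into named fields."""
    names = ["minute", "hour", "day_of_month", "month", "day_of_week"]
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d" % len(fields))
    return dict(zip(names, fields))


def expand_minute_field(field):
    """Expand a minute field such as '*', '*/5', '10', or '0,30' into minutes.

    Ranges like '10-20' are deliberately left out of this sketch.
    """
    minutes = set()
    for part in field.split(","):
        if part == "*":
            minutes.update(range(60))
        elif part.startswith("*/"):
            step = int(part[2:])
            minutes.update(range(0, 60, step))
        else:
            minutes.add(int(part))
    return sorted(minutes)


schedule = parse_crontab("*/5 * * * *")
run_minutes = expand_minute_field(schedule["minute"])
```

Here `run_minutes` holds the minutes 0, 5, 10, and so on up to 55, and the remaining fields are `*`, so the task fires every five minutes of every hour on every day.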
- MPIP-1047: Dynamically Adjusting Resources
In production, MLSQL Engine is deployed for a range of scenarios, including analytics, API serving, ETL, and machine learning. These mixed workloads can put significant strain on resources. MPIP-1047 proposes dynamic resource adjustment: administrators add or remove CPU and memory resources with simple commands, which makes it possible to build a resource control strategy tuned to each scenario.
Code Snippet:
!resource add 10c;
!resource remove 10c;
!resource set 40c;
When resources are adjusted this way, memory is increased or decreased automatically according to the configured memory ratio.
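The arithmetic behind the three commands can be sketched in Python. Several things here are assumptions rather than facts from the proposal: that the `c` suffix counts CPU cores, that memory scales linearly with cores, and the 2 GB per core ratio, which is an arbitrary illustrative value. The function names are hypothetical as well.

```python
def parse_cores(amount):
    """Parse an amount such as '10c' into an integer (assumed to mean cores)."""
    if not amount.endswith("c"):
        raise ValueError("expected an amount like '10c', got %r" % amount)
    return int(amount[:-1])


def apply_resource_command(current_cores, action, amount, memory_gb_per_core=2):
    """Return (cores, memory_gb) after an add, remove, or set command.

    memory_gb_per_core is a made-up ratio used only for illustration.
    """
    n = parse_cores(amount)
    if action == "add":
        target = current_cores + n
    elif action == "remove":
        target = current_cores - n
    elif action == "set":
        target = n
    else:
        raise ValueError("unknown action %r" % action)
    if target < 0:
        raise ValueError("cannot go below zero cores")
    # Memory follows the core count through the configured ratio.
    return target, target * memory_gb_per_core


cores = 30
history = []
for action, amount in [("add", "10c"), ("remove", "10c"), ("set", "40c")]:
    cores, memory_gb = apply_resource_command(cores, action, amount)
    history.append((action, cores, memory_gb))
```

Starting from 30 cores, the sequence moves to 40, back to 30, and then to 40 again, with memory tracking the core count at each step. `add` and `remove` are relative, while `set` is absolute.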
Conclusion
The latest development version of MLSQL 1.3.0 puts forward three MPIPs covering caching, scheduling, and resources: automatic release of cached tables, crontab-style scheduled tasks, and dynamic resource adjustment. Together they aim to improve the user experience, simplify workflows, and make better use of resources. The MPIP process gives MLSQL a structured way to keep adding functionality of this kind as a data processing platform.