Unity + Lua performance optimization: the key points to watch
1. Start with the deadly gameobj.transform.position = pos
Code like gameobj.transform.position = pos is extremely common in Unity, but in ulua, heavy use of this pattern is very bad. Why?
Because this one short line does a great deal of work. To make that concrete, here are the key Lua API calls and ulua steps it triggers (based on ulua + cstolua export; gameobj is a GameObject and pos is a Vector3):
Step 1:
GameObjectWrap.get_transform: Lua asks for the transform of gameobj, corresponding to gameobj.transform
LuaDLL.luanet_rawnetobj: converts the gameobj in Lua into an id that C# recognizes
ObjectTranslator.TryGetValue: uses this id to fetch the C# gameobject from ObjectTranslator
gameobject.transform: after all that preparation, C# finally performs the actual gameobject.transform access
ObjectTranslator.AddObject: assigns an id to the transform; Lua uses this id to represent the transform, and the transform is stored in ObjectTranslator for later lookup
LuaDLL.luanet_newudata: allocates a userdata in Lua and stores the id in it, representing the transform about to be returned to Lua
LuaDLL.lua_setmetatable: attaches a metatable to this userdata so that you can use it as transform.position
LuaDLL.lua_pushvalue
LuaDLL.lua_remove
Step 2:
TransformWrap.set_position: Lua wants to assign pos to transform.position
LuaDLL.luanet_rawnetobj: converts the transform in Lua into an id that C# recognizes
ObjectTranslator.TryGetValue: uses this id to fetch the C# transform from ObjectTranslator
LuaDLL.tolua_getfloat3: reads the 3 float values of the Vector3 from Lua and returns them to C#
lua_getfield + lua_tonumber, 3 times: fetches the x, y and z values and pops them off the stack
transform.position = new Vector3(x, y, z): after all that preparation, the transform.position = pos assignment is finally executed
Just one line of code, and so much work! In C++, a.b.c = x would, after optimization, be nothing more than computing an address and storing to memory. Here, however, the frequent value fetches, stack pushes and C#-to-Lua type conversions all cost CPU time, before even counting the memory allocations and the GC that follows.
Below we explain, step by step, that some of this work is unnecessary and can be omitted. In the end the whole operation can be reduced to:
lua_isnumber + lua_tonumber, 4 times, and nothing else
2. Referencing C# objects in Lua is expensive
As the example above shows, merely fetching a transform from gameobj already carries a high cost. A C# object cannot be handed to C as a raw pointer (it can in fact be done by pinning through GCHandle, but we have not tested the performance, and a pinned object cannot be managed by the GC), so the mainstream lua+unity solutions represent a C# object by an id and use a dictionary in C# to map ids to objects. Because the dictionary holds a reference, it also guarantees that the C# object is not garbage collected while Lua still refers to it.
Therefore, every time a parameter is an object, converting from the id in Lua back to the C# object costs one dictionary lookup; every call to a member method of an object must also find the object first, which is another dictionary lookup.
If the object has been used in Lua before and has not been collected, the cost is just that dictionary lookup. But if it is a new object that Lua has not seen before, you pay for the whole series of preparations shown in the example above.
If the returned object is only used temporarily in Lua, the situation is even worse. The newly allocated userdata and dictionary entry may be deleted as soon as the Lua reference is collected, and the next time you use the object all the preparation has to be done again, leading to repeated allocation and GC and very poor performance.
The gameobj.transform in the example is a huge trap: .transform is only returned temporarily, you never hold a reference to it afterwards, and Lua releases it quickly, so every later .transform access may mean another allocation and another GC.
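The id-to-object bookkeeping described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not ulua's actual ObjectTranslator: the class name ObjectTable and the members RemoveObject and Count are hypothetical, and a real binding also has to reuse the existing id when the same object is returned to Lua again.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a binding's object table; names are illustrative only.
public class ObjectTable
{
    private readonly Dictionary<int, object> objects = new Dictionary<int, object>();
    private int nextId = 1;

    // Called when a C# object is returned to Lua for the first time:
    // allocate an id and keep a strong reference so the C# GC cannot collect it.
    public int AddObject(object obj)
    {
        int id = nextId++;
        objects.Add(id, obj);
        return id;
    }

    // Called on every exported call that receives an object argument:
    // one dictionary lookup per object parameter.
    public bool TryGetValue(int id, out object obj)
    {
        return objects.TryGetValue(id, out obj);
    }

    // Called when Lua collects the userdata that carried the id.
    public bool RemoveObject(int id)
    {
        return objects.Remove(id);
    }

    public int Count { get { return objects.Count; } }
}
```

Every temporary object that crosses into Lua goes through AddObject and later RemoveObject, which is why a throwaway .transform access is so much more costly than it looks.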
3. Passing Unity-specific value types (Vector3/Quaternion, etc.) between Lua and C# is even more expensive
Since calling into C# objects from Lua is slow, performance would basically collapse if every vector3.x had to go through C#. The mainstream solutions therefore implement types such as Vector3 in pure Lua code: a Vector3 is just a table {x, y, z}, so it is fast to use inside Lua.
But once this is done, the Lua and C# representations of Vector3 are two completely different things, so passing one as a parameter means converting between the Lua type and the C# type. For example, when C# passes a Vector3 to Lua, the x, y and z values are pushed onto the Lua stack, a table is allocated, and the three values are inserted into it. One simple parameter transfer thus costs 3 pushes, a table allocation and 3 table insertions, and the performance is easy to imagine.
So how do we optimize this? Our tests show that passing three floats directly to the function is faster than passing a Vector3.
For example, change void SetPos(GameObject obj, Vector3 pos) to void SetPos(GameObject obj, float x, float y, float z). The concrete effect can be seen in the test data later on; the improvement is very clear.
4. When passing parameters and return values between Lua and C#, avoid the following types where possible:
Severe: Vector3/Quaternion and other Unity value types, arrays
Moderately severe: bool, string, all kinds of objects
Recommended: int, float, double
Although we speak of passing parameters between Lua and C#, there is actually a layer of C in between (Lua itself is implemented in C). Lua, C and C# represent many data types differently and use different memory allocation strategies, so data passed among the three often has to be converted (the technical term is parameter marshalling), and the cost of this conversion varies greatly by type.
Let us first look at bool and string in the moderately severe category, which involve the interop cost between C and C#. According to the official Microsoft documentation, C# distinguishes blittable types from non-blittable types. bool and string are non-blittable, meaning their memory representations differ between C and C#, so a type conversion is needed when passing them from C to C#, which reduces performance. string additionally involves memory allocation (the string's memory is copied to the managed heap and converted between UTF-8 and UTF-16).
See https://msdn.microsoft.com/zh-cn/library/ms998551.aspx for more detailed guidance on optimizing interop between C and C#.
The severe category is mostly caused by the bottleneck that ulua and similar solutions hit when mapping Lua objects to C# objects.
The cost of Vector3 and similar types was covered above.
Arrays are even worse. An array in Lua can only be represented as a table, which is completely different from a C# array and has no direct correspondence, so converting a C# array to a Lua table means copying element by element, and if the elements are objects, strings and so on, each one must be converted individually.
5. Keep the number of parameters low for frequently called functions
Whether it is pushint/checkint on the Lua side or the parameter passing from C to C#, parameter conversion is the dominant cost, and it is paid parameter by parameter. The performance of a Lua-to-C# call therefore depends heavily on both the types and the number of parameters. As a rule of thumb, frequently called functions should take no more than 4 parameters; if functions with a dozen or more parameters are called frequently, you will see a clear performance drop, and on a phone a few hundred such calls in one frame can cost on the order of 10ms.
6. Prefer exporting static functions and reduce the use of exported member methods
As mentioned earlier, accessing a member method or member variable of an object requires looking up the Lua userdata and the C# object reference, or looking up the metatable, all of which take considerable time. Exporting static functions directly reduces this cost.
Take obj.transform.position = pos. The approach we recommend is to write a static export function, along these lines:
    public class LuaUtil
    {
        // Static export: Lua passes the object and three plain floats.
        public static void SetPos(GameObject obj, float x, float y, float z)
        {
            obj.transform.position = new Vector3(x, y, z);
        }
    }
Then call LuaUtil.SetPos(obj, pos.x, pos.y, pos.z) in Lua. Performance is much better, because the frequent return of transform is eliminated, and so is the Lua GC caused by those frequent temporary transform returns.
7. While Lua holds a reference to a C# object, the C# object cannot be released; this is a common cause of memory leaks
As mentioned earlier, a C# object returned to Lua is associated with a Lua userdata through a dictionary. As long as the userdata in Lua has not been collected, the dictionary still references the C# object, so it cannot be collected either.
The most common cases are GameObjects and Components. If they are referenced in Lua, you will find them still sitting in the mono heap even after you call Destroy.
However, because this dictionary is the only link between Lua and C#, the problem is not hard to track down: traversing the dictionary reveals it easily. In ulua the dictionary is in the ObjectTranslator class; in slua it is in the ObjectCache class.
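As a rough sketch of what "traversing the dictionary" might look like, the helper below counts held objects by type name. It assumes you can enumerate the values of the binding's dictionary; how to reach that dictionary depends on the ulua or slua version, and the names LeakReport and CountByType are hypothetical.

```csharp
using System.Collections.Generic;

// Hypothetical helper; pass in the values of the binding's id-to-object dictionary.
public static class LeakReport
{
    // Groups the objects still held on behalf of Lua by type name and counts them.
    public static Dictionary<string, int> CountByType(IEnumerable<object> held)
    {
        var counts = new Dictionary<string, int>();
        foreach (object obj in held)
        {
            string name = obj == null ? "null" : obj.GetType().Name;
            int n;
            counts.TryGetValue(name, out n); // n stays 0 when the key is missing
            counts[name] = n + 1;
        }
        return counts;
    }
}
```

Dumping these counts before and after a scene change is one way to spot types whose numbers keep growing because Lua still references them.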
8. Consider using only self-managed ids in Lua instead of referencing C# objects directly
One way to avoid the various performance problems caused by Lua referencing C# objects is to assign your own ids to index the objects, so that the related C# export functions take an int instead of an object as a parameter.
This brings several benefits: function calls are faster; object lifetimes are managed explicitly instead of being left to the binding's automatic reference tracking, which leaks memory when Lua holds a stray reference; and the repeated allocation and GC of temporary userdata described in section 2 no longer happens.
For example, the LuaUtil.SetPos(GameObject obj, float x, float y, float z) above can be further optimized to LuaUtil.SetPos(int objID, float x, float y, float z). We then record the mapping between objID and GameObject in our own code, using an array rather than a dictionary where possible, since array lookup is faster. This further reduces the time Lua spends calling C#, and object management becomes more efficient.
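A minimal sketch of such a self-managed id table is shown below, assuming a list-backed registry with id reuse. FakeGameObject is a stand-in for UnityEngine.GameObject so that the sketch compiles outside Unity, and ObjRegistry, Register and Release are hypothetical names, not part of ulua.

```csharp
using System.Collections.Generic;

// Stand-in for UnityEngine.GameObject so the sketch compiles outside Unity.
public class FakeGameObject
{
    public float X, Y, Z;
}

// Hypothetical registry: Lua only ever sees the int id.
public static class ObjRegistry
{
    private static readonly List<FakeGameObject> objs = new List<FakeGameObject>();
    private static readonly Stack<int> freeIds = new Stack<int>();

    // Stores the object and returns its id, reusing a released slot if one exists.
    public static int Register(FakeGameObject go)
    {
        if (freeIds.Count > 0)
        {
            int id = freeIds.Pop();
            objs[id] = go;
            return id;
        }
        objs.Add(go);
        return objs.Count - 1;
    }

    // Drops the reference so the object can be collected, and recycles the id.
    // The caller must not release the same id twice.
    public static void Release(int id)
    {
        objs[id] = null;
        freeIds.Push(id);
    }

    // Export-style function: an int and three floats, no object parameter.
    public static void SetPos(int id, float x, float y, float z)
    {
        FakeGameObject go = objs[id];
        go.X = x;
        go.Y = y;
        go.Z = z;
    }
}
```

Because the registry owns the only id-to-object mapping, releasing an object is an explicit call in your own code rather than a side effect of Lua's garbage collector.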
9. Use the out keyword sensibly to return complex return values
Returning values of various types from C# to Lua is similar to passing parameters, and carries the same kinds of cost.
For example, Vector3 GetPos(GameObject obj) can be written as void GetPos(GameObject obj, out float x, out float y, out float z).
On the surface the number of parameters has increased, but in the generated export code (taking ulua as the reference) the work changes from LuaDLL.tolua_getfloat3 (which includes get_field + tonumber 3 times) to isnumber + tonumber 3 times.
get_field is essentially a table lookup, which is certainly slower than isnumber reading the stack, so this approach performs better.
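On the C# side, the out-parameter form might look like the sketch below. Vec3 and Target are stand-ins for UnityEngine.Vector3 and the exported object so the code compiles outside Unity, and PosExport is a hypothetical name. How a binding surfaces out parameters to Lua (typically as multiple return values) should be checked against the generated wrapper.

```csharp
// Stand-in for UnityEngine.Vector3.
public struct Vec3
{
    public float x, y, z;

    public Vec3(float x, float y, float z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }
}

// Hypothetical stand-in for the object whose position is queried.
public class Target
{
    public Vec3 position;
}

public static class PosExport
{
    // Hands back three plain floats instead of a Vec3 struct,
    // so no Lua table has to be built or read for the result.
    public static void GetPos(Target t, out float x, out float y, out float z)
    {
        x = t.position.x;
        y = t.position.y;
        z = t.position.z;
    }
}
```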
Measured results
Having said all this, the discussion stays rather abstract without some data. To see the cost of the language layer itself more faithfully, we did not use the gameobj.transform.position from the example directly, because part of its time is spent inside Unity.
We rewrote simplified versions, GameObject2 and Transform2.
    class Transform2 { public Vector3 position = new Vector3(); }
    class GameObject2 { public Transform2 transform = new Transform2(); }
Then we set the position of the transform using several different calling methods:
Method 1: gameobject.transform.position = Vector3.New(1,2,3)
Method 2: gameobject:SetPos(Vector3.New(1,2,3))
Method 3: gameobject:SetPos2(1,2,3)
Method 4: GOUtil.SetPos(gameobject, Vector3.New(1,2,3))
Method 5: GOUtil.SetPos2(gameobjectid, Vector3.New(1,2,3))
Method 6: GOUtil.SetPos3(gameobjectid, 1,2,3)
Each was run 1,000,000 times, with the following results (test environment: Windows build, i7-4770 CPU, luajit with JIT mode turned off; results on phones will differ because of factors such as the luajit architecture and il2cpp, which we will discuss further in the next article):
Method 1: 903ms
Method 2: 539ms
Method 3: 343ms
Method 4: 559ms
Method 5: 470ms
Method 6: 304ms
As you can see, each optimization step brings a clear improvement, and removing the .transform fetch and the Vector3 conversion helps most. Merely by changing how the functions are exported, at little cost, we already save 66% of the time (903ms down to 304ms).
Can we go further? Yes! Building on Method 6, we can get down to just 200ms!
We will keep that in suspense for now and explain it in the next article on luajit integration. In general, we consider the level of Method 6 sufficient.
This is only the simplest case. Many common exports (GetComponentsInChildren, for example, is a performance pit, as are functions that take a dozen or more parameters) need to be optimized according to how you use them. With the analysis above of the performance principles behind Lua integration solutions, working out how to do so should be straightforward.
The next article will be the second part of lua+unity performance optimization: the performance pits of luajit integration.
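Compared with this first part, where you can roughly tell the performance cost just by reading the export code, the problems of luajit integration are far more complex and obscure.
Appendix: C# code for the test cases: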
    using System.Collections.Generic;
    using UnityEngine;

    public class Transform2
    {
        public Vector3 position = new Vector3();
    }

    public class GameObject2
    {
        public Transform2 transform = new Transform2();

        public void SetPos(Vector3 pos) { transform.position = pos; }

        public void SetPos2(float x, float y, float z)
        {
            transform.position.x = x;
            transform.position.y = y;
            transform.position.z = z;
        }
    }

    public class GOUtil
    {
        private static List<GameObject2> mObjs = new List<GameObject2>();

        // Lazily creates 1000 test objects on first use.
        public static GameObject2 GetByID(int id)
        {
            if (mObjs.Count == 0)
            {
                for (int i = 0; i < 1000; i++) { mObjs.Add(new GameObject2()); }
            }
            return mObjs[id];
        }

        public static void SetPos(GameObject2 go, Vector3 pos) { go.transform.position = pos; }

        public static void SetPos2(int id, Vector3 pos) { mObjs[id].transform.position = pos; }

        public static void SetPos3(int id, float x, float y, float z)
        {
            var t = mObjs[id].transform;
            t.position.x = x;
            t.position.y = y;
            t.position.z = z;
        }
    }
Source: https://blog.csdn.net/haihsl123456789/article/details/54017522/