Hello everyone,
I wonder if there is a definate way to know the reason of a certain system load.
We usually have a modest system loads, approx. 10-15% evening times. But sometimes it jumps 100% and stays that way until the server is restarted. The wierdness is that there is no gradual growth, it either 10-15% or 100%. The script_profiles doesn't reveal anything unusual (are those even represent the real picture)? I tried killing the scrips with the top instructions percent, still the sysload doesn't respond to that. I wonder if there are some steps can be done to clarify the picture.
Thank a lot in advance.
system load wirdness
Moderator: POL Developer
Re: system load wirdness
what you do is clear the profile counters with POLCore().clear_script_profile_counters(), wait 60 seconds, then do script profiles. This will tell you what is doing the most instructions in the last minute.
Re: system load wirdness
Here's what I use...
It clears the script profiles, waits 60 seconds, then generates a webpage listing all scripts and their instructions, invocations, instructions per invocation, and percentage in the last 60 seconds from most to least. Do this when your server is taking normal load and compare it to when the server is at 100% load.
.scriptstats
pol/scripts/www/scriptstats.src
It clears the script profiles, waits 60 seconds, then generates a webpage listing all scripts and their instructions, invocations, instructions per invocation, and percentage in the last 60 seconds from most to least. Do this when your server is taking normal load and compare it to when the server is at 100% load.
.scriptstats
Code: Select all
use uo;
use os;
const WEBADDRESS := "";//Put your webserver address, port, and script address here. Ex: "http://server.myserver.com:80/scriptstats.ecl"
program scriptstats(who, text)
var waittime := CInt(text)*60;
if (!waittime)
waittime := 60;
endif
SendSysMessage(who, "Clearing script profile counters.");
Polcore().clear_script_profile_counters();
SendSysMessage(who, "Waiting " + CStr(waittime)+ " seconds.");
Sleep(waittime);
var profiles := PolCore().script_profiles;
var orderone := dictionary;
foreach thing in profiles
if (!thing.instr)
continue;
endif
var instructions := thing.instr;
while (orderone[CInt(instructions)])
instructions := instructions+1;
endwhile
orderone[CInt(instructions)] := thing;
sleepms(5);
endforeach
var ordertwo := orderone.keys();
ordertwo.reverse();
var thearray := array;
var counting := 0;
foreach number in ordertwo
counting := counting + 1;
thearray[counting] := orderone[CInt(number)];
sleepms(5);
endforeach
SetGlobalProperty("#Scriptcycles", thearray);
SendSysMessage(who, "Script logging complete.");
OpenURL(who, WEBADDRESS);
Sleep(30);
EraseGlobalProperty("#Scriptcycles");
endprogram
Code: Select all
use uo;
use http;
use os;
program scriptstats()
WriteHTML("<html><head>");
WriteHTML("<STYLE>a{font-family: Arial, Helvetica, sans-serif; color: #FFFFFF; text-decoration: none}</STYLE>");
WriteHTML("<title>CG Stats</title></head><body bgcolor=\"#000000\" text=\"#FFFFFF\">");
WriteHTML("<H1>Status:</H1>");
var pol:=polcore();
var uptime:=pol.uptime;
var hours:=uptime/60/60;
var minutes:=uptime/60%60;
var seconds:=uptime%60;
WriteHTML("Online: "+CStr(EnumerateOnlineCharacters().size())+"<BR>");
WriteHTML("Items: "+pol.itemcount+"<BR>");
WriteHTML("mobiles: "+pol.mobilecount+"<BR>");
WriteHTML("Sysload: "+pol.sysload+"<BR>");
WriteHTML("Severity: "+pol.sysload_severity+"<BR>");
WriteHTML("Events/min: " +pol.events_per_min+ "<BR>");
WriteHTML("skillchecks/min: " +pol.skill_checks_per_min+ "<BR>");
WriteHTML("combatops/min: " +pol.combat_operations_per_min+ "<BR>");
WriteHTML("script errors/min: " + pol.error_creations_per_min+ "<BR>");
WriteHTML("Tasks on time/min: " +pol.tasks_ontime_per_min+ "<BR>");
WriteHTML("Tasks late/min: " +pol.tasks_late_per_min+ "<BR>");
WriteHTML("Tasks late ticks/min: " +pol.tasks_late_ticks_per_min+ "<BR>");
WriteHTML("Instructions/min: " +pol.instr_per_min+ "<BR>");
WriteHTML("Uptime: "+hours+":"+minutes+":"+seconds+"<BR>");
WriteHTML("<HR>");
var sent:=pol.bytes_sent;
var recvd:=pol.bytes_received;
WriteHTML("Bytes Sent: "+sent+"<BR>");
WriteHTML("Bytes Received: "+recvd+"<BR>");
WriteHTML("Average kb/s: "+(sent/1024)/uptime+"<BR>");
WriteHTML("MB per month: "+(sent/1048576)/hours*744+"<BR>");
WriteHTML("<HR>");
WriteHTML("Currently Running Scripts");
var scriptcycles := GetGlobalProperty("#Scriptcycles");
If (!scriptcycles)
WriteHTML("Unavailable... This is only available 30 seconds after the command is done in game.");
else
WriteHTML("<table border=\"\" style=\"border-collapse: collapse\" bordercolor=\"#111111\" width=\"100%\" id=\"AutoNumber1\">");
WriteHTML("<tr><td width=\"40%\">Script Path/Name</td><td width=\"20%\">Instructions</td><td width=\"10%\">Invocations</td><td width=\"20%\">Instr/Invoc</td><td width=\"10%\">Percent</td></tr>");
foreach thing in scriptcycles
WriteHTML("<tr><td width=\"40%\">"+thing.name+"</td><td width=\"20%\">"+thing.instr+"</td><td width=\"10%\">"+thing.invocations+"</td><td width=\"20%\">"+thing.instr_per_invoc+"</td><td width=\"10%\">"+thing.instr_percent+"</td></tr>");
sleepms(2);
endforeach
endif
writehtml("</table></BODY></HTML>");
endprogram
Re: system load wirdness
Thanks you,
Thats the excellent approach to try.
I suspect it to be the npc, because when I killed all the npc scripts the load went back to 5%.
But the actual script doesn't show in the stat, so I think it might be not the instruction counter, but some other thing like error generation or a high number of events.
Thank you anyway for the great idea.
Thats the excellent approach to try.
I suspect it to be the npc, because when I killed all the npc scripts the load went back to 5%.
But the actual script doesn't show in the stat, so I think it might be not the instruction counter, but some other thing like error generation or a high number of events.
Thank you anyway for the great idea.
Re: system load wirdness
Did you do the command before or after you killed the NPC? If it was after, the command cleared all of that info before generating the info it gave to you.
Re: system load wirdness
I did the command before killing npc scripts to gather the data while sysload is high.CWO wrote:Did you do the command before or after you killed the NPC? If it was after, the command cleared all of that info before generating the info it gave to you.
Re: system load wirdness
in 2 minutes:Online: 3
Items: 57241
mobiles: 3723
Sysload: 23
Severity: 175
Events/min: 44
skillchecks/min: 9
combatops/min: 23
script errors/min: 33132
Tasks on time/min: 73
Tasks late/min: 28
Tasks late ticks/min: 73
Instructions/min: 12089356
Uptime: 1:35:14
scripts/ai/spellkillpcs.ecl 4.82968e+06 1120 4312 39.7421
scripts/ai/killpcs.ecl 2.24826e+06 772 2912 18.5003
scripts/ai/slime.ecl 1.66001e+06 372 4462 13.6597
scripts/ai/doppel.ecl 1.12014e+06 259 4324 9.21734
scripts/ai/firebreather.ecl 670262 167 4013 5.5154
top percentage of running scripts still the same...but sysload growsOnline: 3
Items: 57241
mobiles: 3723
Sysload: 53
Severity: 11686
Events/min: 69
skillchecks/min: 6
combatops/min: 3
script errors/min: 428203
Tasks on time/min: 62
Tasks late/min: 51
Tasks late ticks/min: 10437
Instructions/min: 34087569
Uptime: 1:37:35
scripts/ai/spellkillpcs.ecl 4.87889e+06 1120 4356 40.1388
scripts/ai/killpcs.ecl 2.28635e+06 772 2961 18.8099
scripts/ai/slime.ecl 1.66831e+06 372 4484 13.7252
scripts/ai/doppel.ecl 1.11398e+06 259 4301 9.16473
scripts/ai/firebreather.ecl 675626 167 4045 5.5584
any ideas? why it is so?
Re: system load wirdness
Have you looked in log/scripts.log for some runaway scripts?
Re: system load wirdness
there aren't runaway....
Re: system load wirdness
Your instructions per minute almost tripled which is why your sysload also did the same... Were your players fighting a bunch of NPCs at the time? 4000 instructions per NPC per minute seems quite high...
Re: system load wirdness
Now, I used this topic to fix a few things over here.CWO wrote:Your instructions per minute almost tripled which is why your sysload also did the same... Were your players fighting a bunch of NPCs at the time? 4000 instructions per NPC per minute seems quite high...
I noticed several different scripts going "runaway".
How can I fix it? Is it that important to do, so?
Bcuz over here (phenom II X4 965@3.8Ghz 4GB RAM) I got sysload=3-6% all the time at iddle.
On my SV (a mediocre dual core 2180@2.0Ghz 2GB RAM) it is standing still in something life sysload=70-80%.
Thank you!
Re: system load wirdness
Well, it is important to remedy the sysload if it is constantly at 70-80% because then your shard is vulnerable to lagging. How you fix it depends entirely on what scripts are processing the most instructions. If it's AI that is consistently running a high amount of instructions, maybe you have to look into the sleep mode area of the script (the part of the script that executes when no one is around the NPC). If it happens when people are fighting NPCs, you should look into its fight function. In the case of my shard, I found that when the NPC would go through its instructions to calculate its next move toward its target, there was a bug that made it calculate this thousands of times for each move when only one calculation was enough. It didn't show up as a runaway only because it had a wait_for_event(0) in it (this was actually what I designed .scriptstats for, to find runaway scripts that don't get reported as a runaway). If you have something that is reported as a runaway script every now and then, it's not something to be too concerned about but it's saying that there might be a little too much going on in the script too fast. I have a general performance guide posted here http://forums.polserver.com/viewtopic.p ... 432#p16432. It's not really for AI but may help you if other scripts are also running a bit too much.