Flent-users discussion archives
 help / color / mirror / Atom feed
* [Flent-users] [tohojo/flent] flent on openbsd loops, consuming available cpu/memory, after test completes (#221)
@ 2021-03-06 20:14 Stuart Henderson
  2021-03-06 23:56 ` [Flent-users] " Toke Høiland-Jørgensen
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Stuart Henderson @ 2021-03-06 20:14 UTC (permalink / raw)
  To: tohojo/flent; +Cc: Subscribed

[-- Attachment #1: Type: text/plain, Size: 9128 bytes --]

Hi. flent test runs aren't completing on OpenBSD. I've tried with 1.2.2 on python 2.7.18 and 3.8.8, and with 2.0.0 on 3.8.8 and a few test types. netperf and fping are spawned, tests run, then at the end of the run one of the following happens:

python 2.7.x:
```
PingRunner Ping (ms) ICMP finished
Runner aggregation finished
Traceback (most recent call last):
  File "/usr/local/bin/flent", line 11, in <module>
    load_entry_point('flent==1.2.2', 'console_scripts', 'flent')()
  File "/usr/local/lib/python2.7/site-packages/flent/__init__.py", line 59, in run_flent
    b.run()
  File "/usr/local/lib/python2.7/site-packages/flent/batch.py", line 610, in run
    return self.run_test(self.settings, self.settings.DATA_DIR, True)
  File "/usr/local/lib/python2.7/site-packages/flent/batch.py", line 509, in run_test
    res = self.agg.postprocess(self.agg.aggregate(res))
  File "/usr/local/lib/python2.7/site-packages/flent/aggregators.py", line 261, in aggregate
    for s in range(steps):
MemoryError
TimerRunner Watchdog [Ping (ms) ICMP] finished
```

python 3.x: flent process loops consuming cpu and continually allocating memory. If I run under gdb and point py-bt at it I get a similar trace to 2.7's after the MemoryError.

```
(gdb) run /usr/local/bin/flent -v -H 127.0.0.1 -l 10 tcp_ndown --te=download_streams=4
Starting program: /usr/local/bin/python3.8 /usr/local/bin/flent -v -H 127.0.0.1 -l 10 tcp_ndown --te=download_streams=4
[New process 39572]
Executing test environment file /usr/local/lib/python3.8/site-packages/flent/tests/tcp_ndown.conf
Started Flent 2.0.0 using Python 3.8.8.
Looking up hostname '127.0.0.1'.
Executing test environment file /usr/local/lib/python3.8/site-packages/flent/tests/tcp_ndown.conf
Gathering local metadata
Looking up hostname '127.0.0.1'.
Starting tcp_ndown test. Expected run time: 20 seconds.
which: /home/sthen/bin/fping is not an executable file
which: /bin/fping is not an executable file
which: /sbin/fping is not an executable file
which: /usr/bin/fping is not an executable file
which: /usr/sbin/fping is not an executable file
which: /usr/X11R6/bin/fping is not an executable file
which: /usr/local/bin/fping is not an executable file
which: Found fping executable at /usr/local/sbin/fping
which: /home/sthen/bin/ping is not an executable file
which: /bin/ping is not an executable file
which: Found ping executable at /sbin/ping
which: /home/sthen/bin/netperf is not an executable file
which: /bin/netperf is not an executable file
which: /sbin/netperf is not an executable file
which: /usr/bin/netperf is not an executable file
which: /usr/sbin/netperf is not an executable file
which: /usr/X11R6/bin/netperf is not an executable file
which: Found netperf executable at /usr/local/bin/netperf
which: /home/sthen/bin/netperf is not an executable file
which: /bin/netperf is not an executable file
which: /sbin/netperf is not an executable file
which: /usr/bin/netperf is not an executable file
which: /usr/sbin/netperf is not an executable file
which: /usr/X11R6/bin/netperf is not an executable file
which: Found netperf executable at /usr/local/bin/netperf
which: /home/sthen/bin/netperf is not an executable file
which: /bin/netperf is not an executable file
which: /sbin/netperf is not an executable file
which: /usr/bin/netperf is not an executable file
which: /usr/sbin/netperf is not an executable file
which: /usr/X11R6/bin/netperf is not an executable file
which: Found netperf executable at /usr/local/bin/netperf
which: /home/sthen/bin/netperf is not an executable file
which: /bin/netperf is not an executable file
which: /sbin/netperf is not an executable file
which: /usr/bin/netperf is not an executable file
which: /usr/sbin/netperf is not an executable file
which: /usr/X11R6/bin/netperf is not an executable file
which: Found netperf executable at /usr/local/bin/netperf
Forking to run command /usr/local/sbin/fping  -D -p 200 -c 100 -t 40000   127.0.0.1
Forked /usr/local/sbin/fping as pid 36312
Ping (ms) ICMP: Starting watchdog with timeout 21
Started TimerRunner idx 2 ('Watchdog [Ping (ms) ICMP]')
Started PingRunner idx 2 ('Ping (ms) ICMP')
Forking to run command /usr/local/bin/netperf -P 0 -v 0 -D -0.20 -4 -Y 0x0 -H 127.0.0.1 -p 12865 -t TCP_MAERTS -l 10 -F /dev/urandom -f m   --    -H 127.0.0.1 -k THROUGHPUT,LOCAL_CONG_CONTROL,REMOTE_CONG_CONTROL,TRANSPORT_MSS,LOCAL_TRANSPORT_RETRANS,REMOTE_TRANSPORT_RETRANS,LOCAL_SOCKET_TOS,REMOTE_SOCKET_TOS,DIRECTION,ELAPSED_TIME,PROTOCOL,LOCAL_SEND_SIZE,LOCAL_RECV_SIZE,REMOTE_SEND_SIZE,REMOTE_RECV_SIZE,LOCAL_BYTES_SENT,LOCAL_BYTES_RECVD,REMOTE_BYTES_SENT,REMOTE_BYTES_RECVD  
Forked /usr/local/bin/netperf as pid 69806
TCP download::1: Starting watchdog with timeout 25
Started TimerRunner idx 3 ('Watchdog [TCP download::1]')
Started NetperfDemoRunner idx 3 ('TCP download::1')
Forking to run command /usr/local/bin/netperf -P 0 -v 0 -D -0.20 -4 -Y 0x0 -H 127.0.0.1 -p 12865 -t TCP_MAERTS -l 10 -F /dev/urandom -f m   --    -H 127.0.0.1 -k THROUGHPUT,LOCAL_CONG_CONTROL,REMOTE_CONG_CONTROL,TRANSPORT_MSS,LOCAL_TRANSPORT_RETRANS,REMOTE_TRANSPORT_RETRANS,LOCAL_SOCKET_TOS,REMOTE_SOCKET_TOS,DIRECTION,ELAPSED_TIME,PROTOCOL,LOCAL_SEND_SIZE,LOCAL_RECV_SIZE,REMOTE_SEND_SIZE,REMOTE_RECV_SIZE,LOCAL_BYTES_SENT,LOCAL_BYTES_RECVD,REMOTE_BYTES_SENT,REMOTE_BYTES_RECVD  
Forked /usr/local/bin/netperf as pid 46880
TCP download::2: Starting watchdog with timeout 25
Started TimerRunner idx 4 ('Watchdog [TCP download::2]')
Started NetperfDemoRunner idx 4 ('TCP download::2')
Forking to run command /usr/local/bin/netperf -P 0 -v 0 -D -0.20 -4 -Y 0x0 -H 127.0.0.1 -p 12865 -t TCP_MAERTS -l 10 -F /dev/urandom -f m   --    -H 127.0.0.1 -k THROUGHPUT,LOCAL_CONG_CONTROL,REMOTE_CONG_CONTROL,TRANSPORT_MSS,LOCAL_TRANSPORT_RETRANS,REMOTE_TRANSPORT_RETRANS,LOCAL_SOCKET_TOS,REMOTE_SOCKET_TOS,DIRECTION,ELAPSED_TIME,PROTOCOL,LOCAL_SEND_SIZE,LOCAL_RECV_SIZE,REMOTE_SEND_SIZE,REMOTE_RECV_SIZE,LOCAL_BYTES_SENT,LOCAL_BYTES_RECVD,REMOTE_BYTES_SENT,REMOTE_BYTES_RECVD  
Forked /usr/local/bin/netperf as pid 79870
TCP download::3: Starting watchdog with timeout 25
Started TimerRunner idx 5 ('Watchdog [TCP download::3]')
Started NetperfDemoRunner idx 5 ('TCP download::3')
Forking to run command /usr/local/bin/netperf -P 0 -v 0 -D -0.20 -4 -Y 0x0 -H 127.0.0.1 -p 12865 -t TCP_MAERTS -l 10 -F /dev/urandom -f m   --    -H 127.0.0.1 -k THROUGHPUT,LOCAL_CONG_CONTROL,REMOTE_CONG_CONTROL,TRANSPORT_MSS,LOCAL_TRANSPORT_RETRANS,REMOTE_TRANSPORT_RETRANS,LOCAL_SOCKET_TOS,REMOTE_SOCKET_TOS,DIRECTION,ELAPSED_TIME,PROTOCOL,LOCAL_SEND_SIZE,LOCAL_RECV_SIZE,REMOTE_SEND_SIZE,REMOTE_RECV_SIZE,LOCAL_BYTES_SENT,LOCAL_BYTES_RECVD,REMOTE_BYTES_SENT,REMOTE_BYTES_RECVD  
Forked /usr/local/bin/netperf as pid 46779
TCP download::4: Starting watchdog with timeout 25
Started TimerRunner idx 6 ('Watchdog [TCP download::4]')
Started NetperfDemoRunner idx 6 ('TCP download::4')
NetperfDemoRunner TCP download::2 finished
NetperfDemoRunner TCP download::1 finished
NetperfDemoRunner TCP download::3 finished
NetperfDemoRunner TCP download::4 finished
[New thread 395121]
PingRunner Ping (ms) ICMP finished
Runner aggregation finished
TimerRunner Watchdog [Ping (ms) ICMP]: timer expired
TimerRunner Watchdog [Ping (ms) ICMP] finished
TimerRunner Watchdog [TCP download::2]: timer expired
TimerRunner Watchdog [TCP download::2] finished
TimerRunner Watchdog [TCP download::1]: timer expired
TimerRunner Watchdog [TCP download::4]: timer expired
TimerRunner Watchdog [TCP download::1] finished
TimerRunner Watchdog [TCP download::3]: timer expired
TimerRunner Watchdog [TCP download::4] finished
TimerRunner Watchdog [TCP download::3] finished
^C
Thread 1 received signal SIGINT, Interrupt.
0x00000396ee51daba in float_sub (v=<float at remote 0x39641988a30>, w=<float at remote 0x396fd627590>) at Objects/floatobject.c:559
559	    CONVERT_TO_DOUBLE(v, a);
(gdb) py-bt
Traceback (most recent call first):
  File "/usr/local/lib/python3.8/site-packages/flent/aggregators.py", line 808, in aggregate
  File "/usr/local/lib/python3.8/site-packages/flent/batch.py", line 1027, in run_test
  File "/usr/local/lib/python3.8/site-packages/flent/batch.py", line 1897, in run
  File "/usr/local/lib/python3.8/site-packages/flent/__init__.py", line 59, in run_flent
    b.run()
  File "/usr/local/bin/flent", line 11, in <module>
    load_entry_point('flent==2.0.0', 'console_scripts', 'flent')()
```

Do you have any ideas as to what might be happening please? The netperf and fping processes definitely exit before the "TimerRunner Watchdog [xxx]: timer expired" messages are displayed (confirmed a few times while running pstree at various points). I'm not all that proficient in python but maybe able to poke at some things if you have any ideas of what to try.

Flent has worked on OpenBSD in the past but it's been a couple of years since I previously tried it, so couldn't say when this started. I'm running recent OpenBSD -current on amd64, netperf is v2.7.0 and fping is v5.0.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/tohojo/flent/issues/221

[-- Attachment #2: Type: text/html, Size: 10053 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-03-20 16:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-06 20:14 [Flent-users] [tohojo/flent] flent on openbsd loops, consuming available cpu/memory, after test completes (#221) Stuart Henderson
2021-03-06 23:56 ` [Flent-users] " Toke Høiland-Jørgensen
2021-03-07 15:24 ` Stuart Henderson
2021-03-07 15:33 ` Stuart Henderson
2021-03-07 15:35 ` Stuart Henderson
2021-03-07 19:33 ` Stuart Henderson
2021-03-07 19:33 ` Stuart Henderson
2021-03-07 20:25 ` Toke Høiland-Jørgensen
2021-03-07 20:25 ` Toke Høiland-Jørgensen
2021-03-20 16:43 ` Toke Høiland-Jørgensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox