Discussion:
reduce networking latency
David Xu
2014-09-24 18:40:53 UTC
Hi Michael,

I found this interesting project on the KVM TODO website:

allow handling short packets from softirq or VCPU context
Plan:
We are going through the scheduler 3 times
(could be up to 5 if softirqd is involved)
Consider RX: host irq -> io thread -> VCPU thread ->
guest irq -> guest thread.
This adds a lot of latency.
We can cut it by some 1.5x if we do a bit of work
either in the VCPU or softirq context.
Testing: netperf TCP RR - should be improved drastically
netperf TCP STREAM guest to host - no regression
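
For reference, I assume the tests mentioned above would be run with
something like this (the IPs are placeholders):

    netperf -H <peer-ip> -t TCP_RR      # request/response latency
    netperf -H <host-ip> -t TCP_STREAM  # guest-to-host bulk throughput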

Would you mind saying more about the work done in either the vCPU or
softirq context? Why is it only for handling short packets? Thanks a
lot!


Regards,

Cong
Michael S. Tsirkin
2014-09-29 09:04:18 UTC
Post by David Xu
Hi Michael,
allow handling short packets from softirq or VCPU context
We are going through the scheduler 3 times
(could be up to 5 if softirqd is involved)
Consider RX: host irq -> io thread -> VCPU thread ->
guest irq -> guest thread.
This adds a lot of latency.
We can cut it by some 1.5x if we do a bit of work
either in the VCPU or softirq context.
Testing: netperf TCP RR - should be improved drastically
netperf TCP STREAM guest to host - no regression
Would you mind saying more about the work done in either the vCPU or
softirq context?
For TX, we might be able to execute it directly from VCPU context.
For RX, from softirq context.
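
Roughly like the following. This is just an illustrative userspace
sketch of the dispatch decision, not actual vhost code; the names and
the threshold are made up:

    #include <stdio.h>
    #include <stddef.h>

    #define SHORT_PKT_MAX 128  /* hypothetical cutoff for "short" */

    struct pkt { size_t len; };

    /* Today: all datapath work is deferred to the vhost worker. */
    static void wake_worker_thread(struct pkt *p)
    {
        printf("defer %zu bytes to worker (extra scheduler hop)\n", p->len);
    }

    /* Proposed fast path: transmit without a context switch. */
    static void xmit_inline(struct pkt *p)
    {
        printf("xmit %zu bytes inline\n", p->len);
    }

    /* Would run in VCPU context on a TX kick (softirq for RX). */
    static void handle_tx_kick(struct pkt *p)
    {
        if (p->len <= SHORT_PKT_MAX)
            xmit_inline(p);        /* skip the VCPU -> io thread hop */
        else
            wake_worker_thread(p); /* long packets keep pipelining */
    }

    int main(void)
    {
        struct pkt small = { 64 }, big = { 65536 };
        handle_tx_kick(&small);
        handle_tx_kick(&big);
        return 0;
    }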
Post by David Xu
Why is it only for handling short packets?
Restricting this to short packets is just one idea for avoiding doing
too much work in those contexts.

Doing too much work in VCPU context would break pipelining,
likely degrading stream performance.
Work in softirq context is not accounted against the correct
cgroups, so doing a lot of work there would let a guest steal
CPU from other guests.
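
So any inline fast path would have to stay strictly bounded.
Again a purely illustrative sketch with made-up names and numbers:

    #include <stdbool.h>
    #include <stdio.h>

    #define RX_INLINE_BUDGET 64   /* made-up per-pass cap */

    static int queued = 100;      /* pretend 100 short packets arrived */

    static bool deliver_one_short_pkt(void)
    {
        if (!queued)
            return false;
        queued--;
        return true;
    }

    /* Bound the work done in softirq context: it is not charged to
     * the guest's cgroup, so it must stay cheap. */
    static void rx_softirq(void)
    {
        int done = 0;

        while (done < RX_INLINE_BUDGET && deliver_one_short_pkt())
            done++;
        if (queued)
            printf("defer %d packets to the io thread "
                   "(accounted to the right cgroup)\n", queued);
    }

    int main(void) { rx_softirq(); return 0; }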
Post by David Xu
Thanks a
lot!
Regards,
Cong
David Xu
2014-10-15 18:30:02 UTC
Post by Michael S. Tsirkin
Post by David Xu
Hi Michael,
allow handling short packets from softirq or VCPU context
We are going through the scheduler 3 times
(could be up to 5 if softirqd is involved)
Consider RX: host irq -> io thread -> VCPU thread ->
guest irq -> guest thread.
This adds a lot of latency.
We can cut it by some 1.5x if we do a bit of work
either in the VCPU or softirq context.
Testing: netperf TCP RR - should be improved drastically
netperf TCP STREAM guest to host - no regression
Would you mind saying more about the work done in either the vCPU or
softirq context?
For TX, we might be able to execute it directly from VCPU context.
For RX, from softirq context.
Which steps are removed for TX and RX compared with vanilla? In other
words, what are the new paths? My guess:

TX: guest thread -> host irq?
RX: host irq -> ?
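
Or spelled out (again, just my guess at the before/after paths):

    RX today:    host irq -> io thread -> VCPU thread -> guest irq -> guest thread
    RX proposed: host irq/softirq (handled inline) -> guest irq -> guest thread

    TX today:    guest thread -> kick/vmexit -> io thread -> device
    TX proposed: guest thread -> kick/vmexit -> xmit in VCPU context -> device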

Thanks.
Post by Michael S. Tsirkin
Post by David Xu
Why is it only for handling short packets?
Restricting this to short packets is just one idea for avoiding doing
too much work in those contexts.
Doing too much work in VCPU context would break pipelining,
likely degrading stream performance.
Work in softirq context is not accounted against the correct
cgroups, so doing a lot of work there would let a guest steal
CPU from other guests.
Post by David Xu
Thanks a
lot!
Regards,
Cong
David Xu
2014-10-22 20:35:07 UTC
Post by Michael S. Tsirkin
Post by David Xu
Hi Michael,
allow handling short packets from softirq or VCPU context
We are going through the scheduler 3 times
(could be up to 5 if softirqd is involved)
Consider RX: host irq -> io thread -> VCPU thread ->
guest irq -> guest thread.
This adds a lot of latency.
We can cut it by some 1.5x if we do a bit of work
either in the VCPU or softirq context.
Testing: netperf TCP RR - should be improved drastically
netperf TCP STREAM guest to host - no regression
Would you mind saying more about the work done in either the vCPU or
softirq context?
For TX, we might be able to execute it directly from VCPU context.
For RX, from softirq context.
Do you mean that for RX we put the data directly into a shared buffer
accessed by the guest VM, bypassing the io thread? And that for TX, the
data is added to the shared buffer in vCPU context, which then kicks
the host to send it?
Post by Michael S. Tsirkin
Post by David Xu
Why is it only for handling short packets?
Restricting this to short packets is just one idea for avoiding doing
too much work in those contexts.
Doing too much work in VCPU context would break pipelining,
likely degrading stream performance.
Work in softirq context is not accounted against the correct
cgroups, so doing a lot of work there would let a guest steal
CPU from other guests.
Post by David Xu
Thanks a
lot!
Regards,
Cong
Michael S. Tsirkin
2014-10-23 04:15:30 UTC
Post by David Xu
Post by Michael S. Tsirkin
Post by David Xu
Hi Michael,
allow handling short packets from softirq or VCPU context
We are going through the scheduler 3 times
(could be up to 5 if softirqd is involved)
Consider RX: host irq -> io thread -> VCPU thread ->
guest irq -> guest thread.
This adds a lot of latency.
We can cut it by some 1.5x if we do a bit of work
either in the VCPU or softirq context.
Testing: netperf TCP RR - should be improved drastically
netperf TCP STREAM guest to host - no regression
Would you mind saying more about the work done in either the vCPU or
softirq context?
For TX, we might be able to execute it directly from VCPU context.
For RX, from softirq context.
Do you mean that for RX we put the data directly into a shared buffer
accessed by the guest VM, bypassing the io thread? And that for TX, the
data is added to the shared buffer in vCPU context, which then kicks
the host to send it?
Yes, that's the idea.
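
A toy userspace model of that idea, to make it concrete. Real
virtio/vhost uses vrings and ioeventfd; everything below is made up
for illustration:

    #include <stdio.h>
    #include <string.h>
    #include <stddef.h>

    #define RING_SIZE     8
    #define SHORT_PKT_MAX 128

    struct ring_entry { size_t len; char data[SHORT_PKT_MAX]; };

    /* Stands in for memory shared between guest and host. */
    static struct ring_entry ring[RING_SIZE];
    static unsigned head, tail;

    /* Host side: runs directly in the kicking (vCPU) context,
     * with no wakeup of a separate io thread. */
    static void host_handle_kick(void)
    {
        while (tail != head) {
            struct ring_entry *e = &ring[tail++ % RING_SIZE];
            printf("host: xmit %zu bytes inline\n", e->len);
        }
    }

    /* Guest side: enqueue, then kick. Under KVM the kick would be a
     * write to an ioeventfd-backed register; here it is a call. */
    static void guest_send(const char *buf, size_t len)
    {
        struct ring_entry *e = &ring[head++ % RING_SIZE];
        e->len = len < SHORT_PKT_MAX ? len : SHORT_PKT_MAX;
        memcpy(e->data, buf, e->len);
        host_handle_kick(); /* the "kick" */
    }

    int main(void)
    {
        guest_send("ping", 4);
        return 0;
    }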
Post by David Xu
Post by Michael S. Tsirkin
Post by David Xu
Why is it only for handling short packets?
Restricting this to short packets is just one idea for avoiding doing
too much work in those contexts.
Doing too much work in VCPU context would break pipelining,
likely degrading stream performance.
Work in softirq context is not accounted against the correct
cgroups, so doing a lot of work there would let a guest steal
CPU from other guests.
Post by David Xu
Thanks a
lot!
Regards,
Cong