Description
| Field | Value |
|---|---|
| Bugzilla ID | 1252 |
| Reporter | Nanbor Wang |
| Assigned to | DOC Center Support List (internal) |
| Product | ACE |
| Component | ACE Core |
| Version | 5.2.3 |
| Platform / OS | All / All |
| Priority | P3 |
| Severity | normal |
| Status | NEW |
| Resolution | |
| Created | 2002-07-21 22:27:16 -0500 |
Originally posted by Nanbor Wang on 2002-07-21 22:27:16 -0500
A bug report from Stephen Pope (scp@predict.com) that describes, better than
I could, a problem I had seen but couldn't explain. Anyway, I should have added
a bug report before, but it fell through the cracks :(! Better late than never.
ACE VERSION: 5.2
HOST MACHINE and OPERATING SYSTEM: SUN SPARC Solaris 8
TARGET MACHINE and OPERATING SYSTEM, if different from HOST:
COMPILER NAME AND VERSION (AND PATCHLEVEL): Sun Forte 6 Update 2
AREA/CLASS/EXAMPLE AFFECTED: ACE_TP_Reactor
DOES THE PROBLEM AFFECT: EXECUTION

SYNOPSIS:
ACE_TP_Reactor has several problems managing ACE_Event_Handler
registrations, generally stemming from a failure to look for existing
registrations in the suspend_set_.

DESCRIPTION:
I am experiencing several problems using the ACE_TP_Reactor, all of
which appear to boil down to a failure to consider that a handler's
registration may temporarily reside in the suspend_set_ rather than the
wait_set_ if a register_handler() or unregister_handler() call is processed
while a thread has been dispatched to the same ACE_Event_Handler.

In the TP_Reactor, when an event is to be dispatched to an
ACE_Event_Handler, that handler is temporarily and automatically
suspended so that no other socket events will be dispatched to the
ACE_Event_Handler while the current event is being handled. (See
ACE_TP_Reactor::handle_socket_events at TP_Reactor.cpp:361). When the
event dispatch call returns, the handler is resumed
(TP_Reactor.cpp:393).

Suspension is accomplished by copying the handle's bits in the read, write
and exception handle sets from the TP_Reactor's wait_set_ to its
suspend_set_. Resumption is accomplished by copying those bits back from
the suspend_set_ to the wait_set_ (see
ACE_Select_Reactor_T<>::suspend_i() and
ACE_Select_Reactor_T<>::resume_i()).

The problems arise because registration and unregistration, managed by
the ACE_Select_Reactor_Handler_Repository in the ACE_Select_Reactor_Impl
underlying the ACE_TP_Reactor, are unaware of the fact that "active"
handlers may be sitting in the suspend_set_ instead of the wait_set_.
Thus, if a handler (and handle) have been registered for READ and WRITE
events, and a WRITE event occurs and is dispatched, and if the
handle_output() method of the event handler then calls
unregister_handler(this, WRITE_MASK | DONT_CALL) so that it will not
receive any more write events, the reactor will inadvertently completely
remove the handler (it should leave it there for READ events) because it
does not see any registrations in the wait_set_. The result is that no
more READ events are ever delivered to the handler when they should be.

I have noticed that some ACE users have worked around this particular
problem by having every handler method re-register itself before
returning. This effectively reinstates the handler, but has its own
problems.

A second manifestation is that register_handler() requests that are
processed while a handler is active (and thus suspended) are registered
directly into the wait_set_. This makes it possible for a second thread
to be immediately dispatched into the same handler while it is still
handling the previous dispatch, thus violating the
one-dispatch-at-a-time semantics of the TP_Reactor. Under some
circumstances, this can lead to things like handle_close() - and the
handler's destructor - being called while a thread is still active in
handle_input() or handle_output(), resulting in that thread executing on
an object which has been deleted out from under it.

A cursory look suggests that someone had thought about this problem, and
implemented the mask_ops() methods in ACE_Select_Reactor_T<>, and
overloaded them in TP_Reactor so that when desiring to set/clear bits in
the wait_set_, the suspend_set_ would be checked and if the handle
appeared anywhere there, the bit setting operation would be performed on
the suspend_set_ instead of the wait_set_. This is exactly the sort of
check which is needed to fix the TP_Reactor problems. However, it is in
the ACE_Select_Reactor_Impl class where the mask_ops() methods need to
be available and used instead of direct calls to bit_ops().
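
The two failure modes described above can be sketched with a small stand-alone model. This is not ACE code: `ToyReactor`, its two `std::map` members, the `kRead`/`kWrite` constants and the `suspend_aware` switch are all hypothetical stand-ins, chosen only to show how consulting the suspend set first (roughly what the mask_ops() overloads are described as doing) changes the outcome. It is single-threaded and ignores locking entirely.

```cpp
#include <map>

// Hypothetical event bits, standing in for READ_MASK / WRITE_MASK.
constexpr unsigned kRead = 1u;
constexpr unsigned kWrite = 2u;

// Toy model of a reactor that keeps per-handle event bits in two sets.
// Names and layout are illustrative assumptions, not the ACE implementation.
struct ToyReactor {
    std::map<int, unsigned> wait_set;     // handles eligible for dispatch
    std::map<int, unsigned> suspend_set;  // handles parked during an upcall
    bool suspend_aware;                   // false models the reported behavior

    explicit ToyReactor(bool aware) : suspend_aware(aware) {}

    // Pick the set that (un)registration should modify. The "aware" variant
    // uses the suspend set whenever the handle is currently parked there.
    std::map<int, unsigned>& target(int h) {
        return (suspend_aware && suspend_set.count(h)) ? suspend_set : wait_set;
    }

    void register_handler(int h, unsigned mask) { target(h)[h] |= mask; }

    // Clear bits; if none remain in the chosen set, forget the handle entirely.
    void unregister_handler(int h, unsigned mask) {
        std::map<int, unsigned>& set = target(h);
        auto it = set.find(h);
        unsigned remaining = (it == set.end() ? 0u : it->second) & ~mask;
        if (remaining == 0) {
            wait_set.erase(h);
            suspend_set.erase(h);
        } else {
            it->second = remaining;
        }
    }

    // Park the handle for the duration of an upcall.
    void suspend(int h) {
        auto it = wait_set.find(h);
        if (it == wait_set.end()) return;
        suspend_set[h] |= it->second;
        wait_set.erase(it);
    }

    // Make the handle eligible for dispatch again after the upcall returns.
    void resume(int h) {
        auto it = suspend_set.find(h);
        if (it == suspend_set.end()) return;
        wait_set[h] |= it->second;
        suspend_set.erase(it);
    }

    bool dispatchable(int h, unsigned mask) const {
        auto it = wait_set.find(h);
        return it != wait_set.end() && (it->second & mask) != 0;
    }
};

// First manifestation: the handler drops WRITE from inside its upcall.
// Returns true if READ events can still be delivered afterwards.
bool read_survives(bool aware) {
    ToyReactor r(aware);
    r.register_handler(7, kRead | kWrite);
    r.suspend(7);                     // dispatch begins
    r.unregister_handler(7, kWrite);  // called from inside the upcall
    r.resume(7);                      // dispatch ends
    return r.dispatchable(7, kRead);
}

// Second manifestation: the handler registers for WRITE from inside its upcall.
// Returns true if the handle could be dispatched again before the upcall ends.
bool dispatchable_during_upcall(bool aware) {
    ToyReactor r(aware);
    r.register_handler(7, kRead);
    r.suspend(7);                     // dispatch begins
    r.register_handler(7, kWrite);    // called from inside the upcall
    return r.dispatchable(7, kWrite);
}
```

In this model, the variant that ignores the suspend set loses the READ registration in `read_survives` and lets the handle become dispatchable mid-upcall in `dispatchable_during_upcall`; the suspend-aware variant does neither. Whether the real repository code can be changed this simply is a separate question that the model does not answer.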

SAMPLE FIX/WORKAROUND:
Re-registering a handler with the TP_Reactor just before returning from
a handler method helps to work around the first part of the problem, but
has some quirky race conditions of its own.

Has this problem been addressed in a release later than 5.2? If not -
and I do need a fix in short order! - is there any reason for me not to
solve it by moving the mask_ops() methods down to the level of the
ACE_Select_Reactor_Impl and ACE_Select_Reactor_Handler_Repository so
that the TP_Reactor can handle these issues safely?