The channel's local state is not initialized at the time the channel
context is fetched from the list of channels. This leads to a race
condition if a remote close happens during this window: the remote close
checks whether the local state is open and, if it is not, deletes the
channel from the list. This leads to a use-after-free scenario.
Initialize the local state at the time of fetching the channel context
from the list of channels.
CRs-Fixed: 2155992
Change-Id: If113daba129191bd67ef2460eb4e87c2d5614403
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
Glink does not wait for PIL to report that the subsystem is up. It triggers
link up on the first interrupt processed after SSR, which can cause stability
issues if a delayed interrupt is processed after SSR.
Make Glink wait for PIL to notify it that the subsystem is up and
initialize its state only after that.
CRs-Fixed: 2165753
Change-Id: I71614e6d7e68bf2fa12ac7f27894492019bd3829
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
The Glink log in tx_common uses context-based logging after
wait_for_completion_timeout. This can lead to a use-after-free
scenario, since the context's transport can be freed during the wait.
Use Glink error logging instead.
CRs-Fixed: 2164929
Change-Id: If66bcb7cba1772c2648c143f43a3b88af0799844
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
During SSR, the power down-vote for the transport (xprt) is not called.
This prevents the transport from going to the idle state.
Call the transport down-vote in the SSR path.
CRs-Fixed: 2131780
Change-Id: Ic374073187aab95b700aa3f795787819f34d3c3c
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
The dummy transport is the only way to access if_ptr. When the dummy
transport is freed, the if_ptr allocated for it is not freed. This
results in a memory leak.
Call kfree on if_ptr before freeing the dummy transport.
CRs-Fixed: 2116744
Change-Id: I832e0fcde418b7c3d992f50e817866bc9075da3c
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
The data of an intent is not freed, even in purge_intent_list. This
results in a memory leak.
Call kfree on the data before freeing the intent.
CRs-Fixed: 2116744
Change-Id: Ib99261208df1cc9b63b4cd0a35ac0c7942efb4a8
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
A few function pointers are left uninitialized in the dummy transport.
The system can crash if these function pointers are dereferenced.
Initialize all the function pointers that can be called with
dummy functions.
CRs-Fixed: 2067859
Change-Id: I9172776d9ffa0af5deb9898125fc6403fdcdee0f
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
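The dummy-hook initialization described above can be sketched in user-space C as follows. The struct layout, the hook names and the -ENODEV return value are assumptions made for illustration, not the driver's actual interface.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down transport interface; the real one has more hooks. */
struct xprt_if {
    int (*tx)(struct xprt_if *if_ptr, const void *data, size_t len);
    int (*tx_cmd_ch_close)(struct xprt_if *if_ptr, unsigned int lcid);
    int (*ssr)(struct xprt_if *if_ptr);
};

/* Dummy hooks: safe to call, always report that no device is present. */
static int dummy_tx(struct xprt_if *if_ptr, const void *data, size_t len)
{
    (void)if_ptr; (void)data; (void)len;
    return -ENODEV;
}

static int dummy_tx_cmd_ch_close(struct xprt_if *if_ptr, unsigned int lcid)
{
    (void)if_ptr; (void)lcid;
    return -ENODEV;
}

static int dummy_ssr(struct xprt_if *if_ptr)
{
    (void)if_ptr;
    return -ENODEV;
}

/* Fill every hook that is still NULL so no caller can jump through NULL. */
static void dummy_xprt_init(struct xprt_if *if_ptr)
{
    if (!if_ptr->tx)
        if_ptr->tx = dummy_tx;
    if (!if_ptr->tx_cmd_ch_close)
        if_ptr->tx_cmd_ch_close = dummy_tx_cmd_ch_close;
    if (!if_ptr->ssr)
        if_ptr->ssr = dummy_ssr;
}
```

With this shape, a caller that reaches the dummy transport gets an error code back instead of dereferencing an uninitialized pointer.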
In the function ch_name_to_ch_ctx_create, a reference for ctx is taken
without checking whether ctx is valid. This can lead to a NULL pointer
dereference.
Take the reference only after verifying that ctx is not NULL.
CRs-Fixed: 2059742
Change-Id: I15998780b602e325a90e7c8c303cd442c5381fe8
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
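A minimal user-space sketch of the pattern described above, taking a reference only after the lookup result has been checked. The types and helper names (ch_ctx, find_ctx, ctx_get_by_name) are hypothetical, and the plain integer refcount stands in for whatever reference primitive the driver uses.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical channel context with a plain (non-atomic) reference count. */
struct ch_ctx {
    const char *name;
    int refcount;
};

/* Linear search; returns NULL when no channel has the given name. */
static struct ch_ctx *find_ctx(struct ch_ctx *list, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i].name, name) == 0)
            return &list[i];
    return NULL;
}

/* Look up a channel and take a reference only if the lookup succeeded. */
static struct ch_ctx *ctx_get_by_name(struct ch_ctx *list, size_t n,
                                      const char *name)
{
    struct ch_ctx *ctx = find_ctx(list, n, name);

    if (ctx)
        ctx->refcount++;
    return ctx;
}
```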
Accessing the magic number can cause a use-after-free
if the ctx has already been freed.
Remove the magic number check.
CRs-Fixed: 2061287
Change-Id: Ie157a930c7eb310829766319e0af742114337e6c
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
tx_info is allocated after the remote intent is popped. This causes a
problem when there is no memory for the allocation: Glink then has to push
the intent back, which itself needs memory.
Move the tx_info allocation before the remote intent is popped.
CRs-Fixed: 2063427
Change-Id: I4f174c4b0143454596ac8f7a1c639c853b98a2ce
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
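The reordering described above can be illustrated with a small user-space sketch: allocate first, and pop the intent only once the allocation has succeeded, so the failure path never has to push anything back. All names here (intent_stack, tx_prepare, failing_alloc) are hypothetical, and the injected allocator exists only so the failure path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_INTENTS 8

/* Hypothetical stand-ins for the remote intent list and tx_info. */
struct intent_stack {
    unsigned int ids[MAX_INTENTS];
    int count;
};

struct tx_info {
    unsigned int riid;
};

/* The allocator is injected; it must return memory that free() can release. */
typedef void *(*alloc_fn)(size_t);

static int tx_prepare(struct intent_stack *s, alloc_fn alloc,
                      struct tx_info **out)
{
    struct tx_info *info = alloc(sizeof(*info));

    if (!info)
        return -ENOMEM;          /* no intent was popped, nothing to undo */
    if (s->count == 0) {
        free(info);
        return -EAGAIN;          /* no remote intent available */
    }
    info->riid = s->ids[--s->count];
    *out = info;
    return 0;
}

/* Allocator that always fails, used to simulate memory pressure. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

When the allocation fails, the intent list is left exactly as it was, which is the property the change is after.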
In the glink_open function, the channel context is initialized with the
transport pointer well after its creation. This creates a race condition
if a parallel thread tries to use the transport pointer of the ctx.
Initialize the ctx with the transport pointer right at the time of its
creation.
CRs-Fixed: 2061645
Change-Id: Idcddf1ab10b8673a20bc1f23d8702bf870f79dbd
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
The QoS rate of the xprt is not reset during SSR. This leads to
exhaustion of QoS bandwidth when multiple SSRs happen.
Reset the QoS rate of the xprt to zero when the link goes down.
CRs-Fixed: 2061061
Change-Id: Ibabca5584b01eb93a5b7fcc8a5304136ef400ba0
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
Low-latency use cases are failing because the Glink RX thread that handles
the TX Done command is not being scheduled during high system load.
These new APIs allow clients to specify whether they need the Glink RX
thread to be realtime.
CRs-Fixed: 2050701
Change-Id: I6bd4023394e9ee617797826687f34abaee3fe65d
Signed-off-by: Chris Lew <clew@codeaurora.org>
Inside glink_open, the reference for the channel context is only initialized;
no additional reference is taken. This creates the possibility of a
use-after-free if SSR happens before the glink_open function completes.
Take an additional reference to ensure the context stays valid during
glink_open, even if SSR happens.
CRs-Fixed: 2031123
Change-Id: I94650d2f937416aff33a82073c4db76fab0d0e96
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
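The effect of the additional reference can be seen in a single-threaded model, sketched here under stated assumptions: the reference count is a plain integer, the freed flag stands in for the actual free, and open_with_ssr is a hypothetical helper that simulates SSR cleanup dropping the initial reference in the middle of the open path.

```c
#include <stdbool.h>

/* Hypothetical context; `freed` stands in for the memory being released. */
struct ch_ctx {
    int refcount;
    bool freed;
};

static void ctx_get(struct ch_ctx *ctx)
{
    ctx->refcount++;
}

/* Drop one reference; mark the context freed when the last one goes away. */
static void ctx_put(struct ch_ctx *ctx)
{
    if (--ctx->refcount == 0)
        ctx->freed = true;
}

/* Returns true if ctx was still valid when the open path finished its work. */
static bool open_with_ssr(struct ch_ctx *ctx, bool take_extra_ref)
{
    bool valid;

    if (take_extra_ref)
        ctx_get(ctx);
    ctx_put(ctx);                /* SSR cleanup drops the initial reference */
    valid = !ctx->freed;         /* the open path would touch ctx here */
    if (take_extra_ref)
        ctx_put(ctx);            /* open path releases its own reference */
    return valid;
}
```

Without the extra reference the context is already gone when the open path touches it; with it, the context is released only after the open path drops its own reference.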
In the core_channel_cleanup function, the channel is moved to the dummy
xprt without taking the channel lock. This leads to a race condition where
the transport pointer points to the dummy xprt but the channel still
belongs to the old transport.
Move the channel to the dummy xprt while holding the channel lock.
CRs-Fixed: 2005731
Change-Id: I91903140c1bfa29d909847f318d1339bb717fffc
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
Initialize variables that may otherwise be used without
a value being set in Glink corner cases.
CRs-Fixed: 2004073
Change-Id: If0e813bf1601dd6c1288bc22864ddd2fb3dbf90f
Signed-off-by: Chris Lew <clew@codeaurora.org>
In the function glink_core_remote_close_common, the notify_state callback
is called before the wait queue is cleared. This leads to a deadlock if the
client wants to synchronize its tx and state-notify functions.
Call complete_all before notifying the client about the state change,
so that all pending requests from the client are cleared first.
CRs-Fixed: 1107652
Change-Id: Ia6c4a305eb42c014a928bad36491e6e5f6eac9d5
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
Glink clients are not notified of tx transactions that are waiting
for remote rx done commands during SSR. This change adds a
notify_tx_abort call for any pending packets during intent purge.
Change-Id: I6a6ba17e2dffddc5cdc2de00da737fedf03c9476
Signed-off-by: Chris Lew <clew@codeaurora.org>
In the function edge_name_to_ctx_create, the NULL check is missing after
kzalloc for the edge_ctx variable.
Add the missing NULL check.
CRs-Fixed: 1086686
Change-Id: Icbffbd9d02df97bda531353c41a7025b95a53991
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
In function glink_core_register_transport, deinit function for qos
configuration is called before initializing qos configuration.
Call to glink_core_deinit_xprt_qos_cfg function is removed.
CRs-Fixed: 1088375
Change-Id: Ifffab071efed56541e763e4f6f51aa45d7a6678b
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
The Glink channel context is initialized with the magic number after the
open command has been sent to the remote side.
Fix the ordering so that initialization happens before the open command is sent.
CRs-Fixed: 1075481
Change-Id: Ia6b28a3b35a4093aea7af1cffea2a5e093d33ccd
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
Initialize the return value in the glink tx scheduler
function.
CRs-Fixed: 1067981
Change-Id: I3f78196927501f582c36d5815096581185d797b4
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
Initialize the return value in the glink tx scheduler
function.
CRs-Fixed: 1067981
Change-Id: I7cad7a724666f34bce73d40e4975373604fb1e87
Signed-off-by: Chris Lew <clew@codeaurora.org>
Inside glink_scheduler_tx, tx_info is not validated after the tx operation
and after taking the spinlock, even though two functions can release the
reference for tx_info while the glink_scheduler_tx thread is preempted.
These functions are ch_purge_intent_lists and
ch_remove_tx_pending_remote_done.
Validate tx_info against the tx_active list after the tx operation and
after taking the spinlock.
CRs-Fixed: 1061565
Change-Id: I80c64d66625b9fe9205e8ffaa7cfc851e06fcb94
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
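A rough sketch of the validation step described above: after re-taking the lock, the saved pointer is looked up in the active list by address and is not dereferenced unless it is found. The singly linked list and the helper name tx_active_contains are assumptions for illustration; locking is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tx_info node on a singly linked tx_active list. */
struct tx_info {
    struct tx_info *next;
};

/*
 * Only pointer values are compared; `candidate` itself is never
 * dereferenced, because another path may already have released it.
 */
static bool tx_active_contains(const struct tx_info *head,
                               const struct tx_info *candidate)
{
    for (const struct tx_info *it = head; it; it = it->next)
        if (it == candidate)
            return true;
    return false;
}
```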
Glink core channel cleanup runs a lot of code under a spinlock with
preemption disabled, which leads to a deadlock scenario.
Use the spinlock only for the critical section and run the rest of the
code without it.
CRs-Fixed: 1060407
Change-Id: I577dbff1cf2ee3711e1879aaa6dc48c72f98b98c
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
Glink does not validate the handle received from client APIs.
This can lead to illegal memory access.
Add a magic number, along with an RCU lock, to validate the handle
received from the client.
CRs-Fixed: 1047743
Change-Id: I08c854d5885672cbe5410efe0736640b55de8bbb
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
Using a first-fit algorithm to select the remote rx intent from the
list is not optimal.
Optimize the selection of the intent from the list using a best-fit algorithm.
CRs-Fixed: 1058750
Change-Id: I7b2a70188975b75a0fbcd2a6cb26f28cc0258532
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
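The difference between the two selection strategies can be sketched as follows. This is an illustrative user-space version, assuming an array of intents with a size and an in-use flag; the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical remote rx intent: an id, a buffer size and an in-use flag. */
struct rx_intent {
    unsigned int id;
    size_t size;
    int in_use;
};

/*
 * Best fit: return the smallest free intent whose size is at least `len`,
 * or NULL if none fits. First fit would instead return the first free
 * intent that is large enough, even if a much smaller one exists later.
 */
static struct rx_intent *best_fit_intent(struct rx_intent *list, size_t n,
                                         size_t len)
{
    struct rx_intent *best = NULL;

    for (size_t i = 0; i < n; i++) {
        struct rx_intent *it = &list[i];

        if (it->in_use || it->size < len)
            continue;
        if (!best || it->size < best->size)
            best = it;
    }
    return best;
}
```

For a 100-byte packet and free intents of 4096, 256, 128 and 64 bytes, first fit would consume the 4096-byte intent, while best fit picks the 128-byte one and leaves the large intent available for a large packet.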
GLink SPI Transport enables point-to-point communication with an
external subsystem that uses SPI bus to interface. This enables
multiplexing multiple logical channels over the SPI bus.
CRs-Fixed: 1045916
Change-Id: I1936bb0542bcd531726bf987ef806969ce96d498
Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
Currently the remote rx_intent is stored with the primary information.
The transport cannot provide a cookie to be retrieved and used later during
transmission.
Add support to receive a remote rx_intent with a cookie.
CRs-Fixed: 1045916
Change-Id: Id5f204647205b2fde9e5cb422a3ddc8cc4f3a5a0
Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
During parallel migration, race conditions are seen in the remote open
and local open ack functions.
Introduce an edge-based lock to avoid race conditions during
simultaneous migration. The edge lock is shared across the multiple
transports of the same edge and is stored in a global list.
CRs-Fixed: 1010920
Change-Id: I2b988d2a6112add06fa433c4b1deeec0b6e6bb58
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
In glink_core_channel_cleanup there is a race condition while
traversing the channels list. This change holds the xprt
channel spinlock during the list manipulation.
CRs-Fixed: 988266
Change-Id: Idcff59ca1483fd98173255d6258e6771d91dec19
Signed-off-by: Chris Lew <clew@codeaurora.org>
Add else statement in glink_close for a race condition where the
xprt state is set to GLINK_XPRT_DOWN and glink_close runs before
the channel is migrated.
CRs-Fixed: 988266
Change-Id: I4de6530f1fbffd9f3acd1fa539cf756364ea32ac
Signed-off-by: Chris Lew <clew@codeaurora.org>
If process_open_event is delayed and Glink has migrated to a new transport,
the open event is treated as a new open event and migration
happens on a fully open channel.
If the channel is fully open, do not allow migration, as the client might
already be using the channel for communication.
Change-Id: I6c1760bc19f52e7d0c1c9834a72e2304f0ae28c8
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
Add an option to rwref locks that allows the lock functions
to spin when acquiring the lock. Change the completion variable
to use waitqueues for sleep functionality.
Change rwref reference function calls to use locking functions
where code reads or writes the context state.
CRs-Fixed: 988266
Change-Id: Ib2908b2495b1b01a6a130033143a7da8e5c0c231
Signed-off-by: Chris Lew <clew@codeaurora.org>
Update the locking hierarchy to reflect the current and
future use-cases. This helps in avoiding deadlock due
to out-of-order locking scenario.
CRs-Fixed: 988266
Change-Id: Ib40da2ecd413e7712cacc9663394e725ebd64a0a
Signed-off-by: Chris Lew <clew@codeaurora.org>
Channel migration logic assumes that the remote & local channel contexts
are always different and exist in different transports. If the remote
& local channel contexts exist in the same transport, then it leads to
a use-after-free scenario.
Fix the channel migration logic by not freeing the channel context if
the local & remote side opens in the same initial transport.
Change-Id: I319a93c49022b08e5c33b561d982a751d5223a58
Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
Currently, Rx and Tx are based on a workqueue, and scheduling the workqueue
takes significant time, which hampers performance.
Use a tasklet if the underlying transport supports atomic context; otherwise
use a kworker.
CRs-Fixed: 978296
Change-Id: I736d2b90730ec10f9dff21944c4ad50e4d87da5c
Signed-off-by: Dhoat Harpal <hdhoat@codeaurora.org>
During subsystem restart, if a transmit operation is waiting for a remote
receive intent acknowledgment, signal the waiter that the receive intent
request will not be acknowledged. Also check the transport and channel
state before waiting for the acknowledgment. This will prevent the
transmit operation from blocking indefinitely under error scenario.
CRs-Fixed: 952184
Change-Id: I29b8215841f7dcca52137f451665eaf339a6f78e
Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
The channel is closed locally by the client either as part of SSR or normal
working scenario. The transmit operation does not check the transport or
channel states before queuing the packet for transmit operation. This
causes the transmit operation to access stale transport or channel context.
Check the transport and channel state before queuing the packet for
transmission.
CRs-Fixed: 947627
Change-Id: Ic6f8350b6b5e51b641794255f8520ff4616343bb
Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
During glink_tx(), G-Link can wait for an unlimited amount of time for
the remote side to queue an RX intent. In some cases, e.g. SSR, the wait
must be restricted to a short time, but in the current implementation,
glink_tx() can continue to block indefinitely.
Add a configurable timeout value to the G-Link channel context, which is
set in the channel open configuration. If the value is set to 0, treat
it as an infinite timeout. This allows a timeout to be put in place by
the client for sensitive cases such as SSR where a very limited amount of
time can be spent waiting for an intent.
Change-Id: I1e480fac286d285f871fe3059de7ae761fc4581e
Signed-off-by: Steven Cahail <scahail@codeaurora.org>
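The timeout semantics described above, where 0 means wait forever, can be sketched as a small predicate. The struct and field names are assumptions for illustration; the source only states that the timeout is set through the channel open configuration.

```c
#include <stdbool.h>

/* Hypothetical open configuration carrying the intent-wait timeout. */
struct open_cfg {
    unsigned int rx_intent_req_timeout_ms;   /* 0 means no timeout */
};

/*
 * True if the caller should give up waiting for a remote RX intent after
 * `elapsed_ms`. A configured timeout of 0 never expires.
 */
static bool intent_wait_expired(const struct open_cfg *cfg,
                                unsigned int elapsed_ms)
{
    if (cfg->rx_intent_req_timeout_ms == 0)
        return false;
    return elapsed_ms >= cfg->rx_intent_req_timeout_ms;
}
```

A client that must not block during SSR would set a small non-zero value, while a client that leaves the field at 0 keeps the earlier unbounded-wait behavior.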
All transports' debug logs are captured in one logging context,
which makes debugging difficult and risks losing
important logs to other high-traffic transports like RPM.
Create a separate logging context for each transport for better
debugging.
Change-Id: If2d00966a186dc48badc8a9a2e017eec6895dcad
Signed-off-by: Arun Kumar Neelakantam <aneela@codeaurora.org>